Abstract. In the random graph $G(n,p)$ with $pn$ bounded, the degrees of the vertices are
almost i.i.d. Poisson random variables with mean $\lambda := p(n-1)$. Motivated by this fact,
we introduce the Poisson cloning model $G_{PC}(n,p)$ for random graphs, in which the degrees
are i.i.d. Poisson random variables with mean $\lambda$. We first establish a theorem
showing that the new model is equivalent to the classical model $G(n,p)$ in an asymptotic sense.
Next, we introduce a useful algorithm, called the cut-off line algorithm, to generate the
random graph $G_{PC}(n,p)$. The Poisson cloning model $G_{PC}(n,p)$ equipped with the cut-off
line algorithm enables us to analyze very precisely the sizes of the largest component and
the $t$-core of $G(n,p)$. This new approach yields not only elegant proofs but
also improved bounds that are essentially best possible.

We also consider the Poisson cloning models for random hypergraphs and random $k$-SAT
problems. Then, the $t$-core problem for random hypergraphs and the pure literal algorithm
for random $k$-SAT problems are analyzed.

1 Introduction 

The notion of a random graph was first introduced in 1947 by Erdős [25] to show the
existence of a graph with a certain Ramsey property. A decade later, the theory of
random graphs began with the paper entitled On Random Graphs I by Erdős and Rényi
[27], and the theory was developed further in a series of their papers [28, 29, 30, 31, 32].
Since then, the subject has become one of the most active research areas. Many researchers have
devoted themselves to studying various properties of random graphs, such as the emergence
of the giant component [23, El 1S2], the connectivity [271 1221 ESI, the existence of perfect 
matching [301 ED ESI US], the existence of Hamiltonian cycle(s) [SH [TOl [15], the $k$-core
problem [TOl [Ml [61], and graph invariants like the independence number [I^ [56] and the
chromatic number [621 [121 [53]. (The list of references here is far from exhaustive.)

There are two canonical models for random graphs, both of which originated in the
simple model introduced in [25]. In the binomial model $G(n,p)$ on a set $V$ of $n$ vertices, each
of the $\binom{n}{2}$ possible edges is in the graph with probability $p$, independently of the other edges. Thus,
the probability of $G(n,p)$ being a fixed graph $G$ with $m$ edges is $p^m (1-p)^{\binom{n}{2}-m}$. The uniform
model $G(n,m)$ on $V$ is a graph chosen uniformly at random from the set of all graphs on $V$
with $m$ edges. Hence, $G(n,m)$ becomes a fixed graph $G$ with probability $\binom{\binom{n}{2}}{m}^{-1}$, provided
$G$ has $m$ edges. Most asymptotic behaviors of the two models are almost identical if their
expected numbers of edges are the same. (See Proposition 1.13 in [B].) The random graph
process, in which random edges are added one by one, has also been studied extensively. For more
about models and basics of random graphs, we recommend two books with the identical
title Random Graphs, by Bollobás [TT], and by Janson, Łuczak and Ruciński [Hj.

The phase transition phenomenon is among the most interesting topics in random graphs.
Specifically, the phase transitions regarding the emergence of the giant (connected)
component and of the $t$-core have attracted much attention. In their monumental
paper entitled On the Evolution of Random Graphs [28], Erdős and Rényi proved
that, for the size $\ell_1(n,p)$ of the largest component of $G(n,p)$,



where $\theta_\lambda$ is the positive solution of the equation $1 - \theta - e^{-\lambda\theta} = 0$.

Why does the size of the largest component change so dramatically around $\lambda = 1$? It
was Karp [IB] who nicely explained the reason. To find the component $C(v)$ of a fixed vertex
$v$ of $G(n,p)$, one may first expose the vertices that are adjacent to $v$, and keep repeating
the same procedure with each of those adjacent vertices: Initially, $v$ is active and all
other vertices are neutral. At each step, we take an active vertex $w$ and expose all neutral
vertices adjacent to $w$. This can be done by checking whether $\{w,w'\} \in G(n,p)$ for all
neutral vertices $w'$. Then, activate all neutral vertices that are adjacent to $w$. The vertex
$w$ is no longer active, and only non-activated neutral vertices remain neutral. The process
terminates when there are no more active vertices. Clearly, the process stops after finding
all the vertices in the component containing $v$. Provided the number of neutral vertices does
not decrease too fast, the number of newly activated vertices has a distribution close to that
of the binomial random variable $\mathrm{Bin}(n-1,p)$, where

$$\Pr[\mathrm{Bin}(n-1,p) = i] = \binom{n-1}{i} p^i (1-p)^{n-1-i}.$$

In particular, the mean of the number is close to $pn$. If $pn < 1 - \delta$ for a fixed $\delta > 0$, then the
process is expected to die out quickly almost every time. Thus, all $C(v)$'s are expected to be
small. If $pn > 1 + \delta$, then the process may survive forever with positive probability. Hence,
$C(v)$ can be large with positive probability; as there are many (actually $\Theta(n)$) trials, at least
one of the $C(v)$'s is expected to be large. Applying this approach to the random directed graph,
Karp was able to prove a phase transition phenomenon for the size of the largest strong
component.
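The exploration process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `explore_component`, the vertex labels 0, ..., n-1, and the lazy exposure of edges (each pair is examined at most once, when one endpoint is active and the other neutral) are choices made here, not part of the original text.

```python
import random

def explore_component(n, p, v=0, rng=None):
    """Return the size of the component of v in G(n,p), exposing edges lazily.

    Hypothetical helper for illustration: vertices are 0..n-1.
    """
    rng = rng or random.Random()
    neutral = set(range(n)) - {v}   # vertices not yet reached
    active = [v]                    # reached, but neighbours not yet exposed
    size = 1
    while active:
        w = active.pop()
        # Expose the pairs {w, w'} for every neutral w'; each is an edge with probability p.
        found = [u for u in neutral if rng.random() < p]
        neutral.difference_update(found)
        active.extend(found)        # newly found vertices become active
        size += len(found)
    return size
```

Running `explore_component(n, c / n)` repeatedly for `c` slightly below and slightly above 1 gives a quick empirical feel for the change in behavior discussed above.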

Notice that, when $pn = O(1)$, the distribution of $\mathrm{Bin}(n-1,p)$ is very close to the Poisson
distribution with mean $\lambda := p(n-1)$. Hence, we may further expect that the process
described above could be approximated by the Galton-Watson branching process defined by
a Poisson random variable $\mathrm{Poi}(\lambda)$ with mean $\lambda$, where

$$\Pr[\mathrm{Poi}(\lambda) = i] = e^{-\lambda} \frac{\lambda^i}{i!}.$$

Generally, the Galton-Watson branching process defined by a random variable $X$ starts
with a single unisexual organism. The organism gives birth to $X_1$ children, where $X_1$
is a random variable with the same distribution as $X$. The same but independent birth
process continues from each of the children and the grandchildren and so on, until no more
descendants exist. (For more information regarding Galton-Watson branching processes, one
may refer to [8].) For simplicity, we call the Galton-Watson branching process defined by
$\mathrm{Poi}(\lambda)$ the Poisson($\lambda$) branching process.
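A Poisson($\lambda$) branching process is easy to simulate, which may help in building intuition for the subcritical and supercritical regimes. The sketch below is an assumption-laden illustration: the names `poisson` and `total_progeny`, the truncation parameter `cap`, and the use of Knuth's multiplication method for Poisson sampling (adequate only for small $\lambda$) are all choices made here.

```python
import math
import random

def poisson(lam, rng):
    """Sample Poi(lam) by Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def total_progeny(lam, rng, cap=10_000):
    """Total population of a Poisson(lam) branching process, truncated at cap."""
    alive, total = 1, 1            # organisms still to reproduce, organisms born so far
    while alive and total < cap:
        children = poisson(lam, rng)
        alive += children - 1      # one organism reproduces and is done
        total += children
    return total
```

For $\lambda < 1$ the total population is finite with probability 1 and has mean $1/(1-\lambda)$, a standard fact about subcritical branching processes; for $\lambda > 1$ a positive fraction of runs reaches `cap`.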

The Poisson cloning model. To convert the above observation into a rigorous proof, one
needs to overcome or bypass two main obstacles. The first is that the degrees of the
vertices of $G(n,p)$ are not exactly i.i.d. Poisson random variables. Though they have the same
distribution as $\mathrm{Bin}(n-1,p)$, they are not mutually independent. For example, the sum of
all degrees must be even, as it is twice the number of edges. This cannot be guaranteed if
the degrees are independent. The second is that the number of neutral vertices keeps
decreasing. Even if neither obstacle causes substantial differences in many cases, one
needs at least to keep track of small differences for rigorous proofs. Since these kinds of
small differences occur almost everywhere in the analysis, they sometimes make rigorous
analysis significantly difficult, if not impossible. Furthermore, the fact that the number of
neutral vertices decreases not only plays a crucial role but also yields a different result
when one wants to know more precise behaviors.

As an approach to bypass the first obstacle, we introduce the Poisson cloning model
$G_{PC}(n,p)$ for random graphs, in which the degrees are i.i.d. Poisson random variables with
mean $\lambda = p(n-1)$. Moreover, the new model is equivalent to the classical model $G(n,p)$ in
an asymptotic sense. Defining the model is not difficult: First take i.i.d.
Poisson $\lambda$ random variables $d(v)$ indexed by all vertices $v$ in $V$. Then take $d(v)$ copies, or
clones, of each vertex $v$. If the sum of the $d(v)$'s is even, then we generate a uniform random
perfect matching on the set of all clones. An edge $\{v,w\}$ is in the random graph $G_{PC}(n,p)$
if a clone of $v$ is matched to a clone of $w$ in the random perfect matching. The resulting
graph may or may not be a simple graph. If the sum is odd, one may just take a graph with a
self loop. Hence, the graph is never simple if the sum is odd. In the next section, the
Poisson cloning model is defined in detail.

It is also possible to extend the model to uniform hypergraphs, where a $k$-uniform
hypergraph on the vertex set $V$ is a collection of subsets of $V$ of size $k$. A graph is then a
2-uniform hypergraph. In the binomial model $H(n,p;k)$ for random $k$-uniform hypergraphs,
each of the $\binom{n}{k}$ possible edges is in the hypergraph with probability $p$, independently of the other edges. The
Poisson cloning model for random $k$-uniform hypergraphs may be defined similarly and is
denoted by $H_{PC}(n,p;k)$.

The following theorem shows that the new model is essentially equivalent to the binomial
model.

Theorem 1.1 Suppose $k \ge 2$ and $p = \Theta(n^{1-k})$. Then, for any collection $\mathcal{H}$ of $k$-uniform
simple hypergraphs,

$$c_1 \Pr[H_{PC}(n,p;k) \in \mathcal{H}] \le \Pr[H(n,p;k) \in \mathcal{H}] \le c_2 \left( \Pr[H_{PC}(n,p;k) \in \mathcal{H}]^{1/k} + e^{-n} \right),$$

where

$$c_1 = k^{1/2} e^{\frac{p}{n}\binom{n}{k}\binom{k}{2} + \frac{p^2}{2}\binom{n}{k}} + O(n^{-1/2}), \qquad c_2 = \frac{k}{k-1} \left( c_1 (k-1) \right)^{1/k} + o(1),$$

and $o(1)$ goes to $0$ as $n$ goes to infinity.

To overcome the second obstacle, we present an algorithm, called the cut-off line
algorithm, that enables us to generate the Poisson cloning model and analyze problems
simultaneously. As a consequence, the size of the largest component of $G_{PC}(n,p)$ can be described
very precisely. It is also possible to analyze the size of the $t$-core of the random hypergraph
$H_{PC}(n,p;k)$, where the $t$-core of a hypergraph is the largest subhypergraph with minimum
degree at least $t$.

The emergence of the giant component. After the phase transition result of Erdős and
Rényi, it remained to determine the size of the largest component when $pn \to 1$. Though
Erdős and Rényi suggested that the size $\ell_1(n,p)$ of the largest component could be only
$O(\log n)$, $\Theta(n^{2/3})$, or $\Theta(n)$, Bollobás [9] showed that $\ell_1(n,p)$ increases rather continuously
by estimating it quite accurately for $pn - 1 > n^{-1/3}\sqrt{\log n}/2$. Later Łuczak [52] was able to
estimate $\ell_1(n,p)$ for $pn - 1 \gg n^{-1/3}$.

In the statements of theorems, lemmas, etc., of this paper, we use the following
convention.

Convention: When we say that a statement is true for all $\alpha$ in the range $a \ll \alpha \ll b$, it
actually means that there is a (small) constant $\varepsilon > 0$ so that the statement is true for $\alpha$ in the
range $a/\varepsilon \le \alpha \le \varepsilon b$.

Theorem 1.2 ^52j (Supercritical Phase) Suppose X = X{p, n) = 1+e with e ^ n^^^^ . Then, 
for large enough n, with probability at least 1 — 7{e^n/8)~^^^ , 



and all other components are smaller than n^/^. 

Using estimations for the number of connected graphs with certain numbers of vertices 
and edges, and the first and second moment methods, one may also obtain the following 
result for the subcritical phase. 

Theorem 1.3 (Subcritical Phase) Let \(n,p) = I — e with n^^^^ <^ e <^ 1. Then, for any 
positive constant 5 < 1/3 and large enough n, with probability at least 1 — {-^Y^"^, 



There have been many results regarding the structure of the largest component too, for
which readers may refer to [HI [52], [121 [55] and references therein.

For Poisson branching processes, a duality principle is known. A pair $(\mu, \lambda)$ with
$\mu < 1 < \lambda$ is called a conjugate pair if $\mu e^{-\mu} = \lambda e^{-\lambda}$. It is easy to see that $\mu = (1 - \theta_\lambda)\lambda$ for a
conjugate pair $(\mu, \lambda)$. For a conjugate pair $(\mu, \lambda)$, the distribution of the Poisson($\lambda$) branching
process conditioned on the event that the process dies out is exactly the same as that of the Poisson($\mu$)
branching process. (See e.g. [B], p. 164.) A similar but somewhat coarser duality was observed
for the random graphs $G(n,p)$ and $G(n^*,p)$ with $\lambda = \lambda(n,p) > 1$ and $n^* = (1 - \theta_\lambda)n$.
Notice that $1 - \theta_\lambda$ is the extinction probability for the Poisson($\lambda$) branching process. It has
been known that the component sizes of $G(n^*,p)$ and those of $G(n,p)$ excluding the largest
component are the same in an asymptotic sense (see [^).
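The conjugate pair relation can be checked numerically. The sketch below is an illustration only: the function names `theta` and `conjugate` and the use of fixed-point iteration are choices made here. It computes the positive root $\theta_\lambda$ of $1 - \theta - e^{-\lambda\theta} = 0$ for $\lambda > 1$, sets $\mu = (1-\theta_\lambda)\lambda$, and compares $\mu e^{-\mu}$ with $\lambda e^{-\lambda}$.

```python
import math

def theta(lam, iters=200):
    """Positive root of 1 - theta - exp(-lam * theta) = 0 for lam > 1.

    Fixed-point iteration started at 1 decreases monotonically to the largest root.
    Convergence slows down as lam approaches 1.
    """
    t = 1.0
    for _ in range(iters):
        t = 1.0 - math.exp(-lam * t)
    return t

def conjugate(lam):
    """Return mu = (1 - theta_lam) * lam, the conjugate of lam > 1."""
    return (1.0 - theta(lam)) * lam

lam = 2.0
mu = conjugate(lam)
# The two sides of the conjugate pair identity should agree up to rounding error.
print(mu * math.exp(-mu), lam * math.exp(-lam))
```

The identity follows directly from the defining equation: since $1 - \theta_\lambda = e^{-\lambda\theta_\lambda}$, one has $\mu e^{-\mu} = \lambda e^{-\lambda\theta_\lambda} e^{-\lambda(1-\theta_\lambda)} = \lambda e^{-\lambda}$.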

The Poisson cloning model $G_{PC}(n,p)$ equipped with the cut-off line algorithm enables
us not only to estimate $\ell_1(n,p)$ more accurately but also to establish a precise discrete duality
principle: In the supercritical phase $\lambda := \lambda(n,p) = 1 + \varepsilon$ with $n^{-1/3} \ll \varepsilon \ll 1$, $G_{PC}(n,p)$ can
be decomposed into three vertex-disjoint graphs $C$, $S$ and $G$ whp (with high probability),
where $C$ is a connected graph of size about $\theta_\lambda n$, $S$ is a smaller graph of size about $\varepsilon^{-2} \ll \theta_\lambda n$,
and $G$ has the same distribution as $G_{PC}(n^*,p^*)$ with $n^* \approx (1 - \theta_\lambda)n$ and $p^* \approx p$, which yields
$\lambda(n^*,p^*) \approx \mu := (1 - \theta_\lambda)\lambda$. In the subcritical phase $\lambda = 1 - \varepsilon$ with $n^{-1/3} \ll \varepsilon \ll 1$, the
largest component is of size

$$\frac{\log(\varepsilon^3 n) - 2.5 \log\log(\varepsilon^3 n) + O(1)}{-(\varepsilon + \log(1-\varepsilon))}$$

whp. The precise statements are as follows. We concentrate on the cases $\varepsilon \ll 1$, for which
more careful analysis is required. It is believed that the proofs are easily modified for the
cases of positive constants $\varepsilon$.

Theorem 1.4 Supercritical Phase: Let X := X{n,p) = 1 + e with n~^^^ <^ e <^ 1, ^ : = 
(1 — 9^)X and 1 <^ a <^ (e^nY^'^. Then, with probability 1 — e~^^" \ Gpc{n,p) may be 
decomposed into three vertex disjoint graphs G, S and G, where G is connected and 

e^n - oiinjefl'^ < \G\ < O^n + a{n/ef/^, 
and \S\ < and G has the same distribution as Gpc{n*,p*) for some n* and p* satisfying 
(1 - 9Jn - a{n/e f'^ <n* <{l- ^Jn + a{n/eY'^, 

and 



fx 



a{en)-^/^ < X{n*,p*) < ^ + a{eny^/\ 



Subcritical Phase: Suppose X := X{n,p) = 1—e with n <^ e Then, the size £f'^(n,p) 

of the largest component of Gpc{n,p) satisfies 



and 



Ft 



Ft 



^r(n,p) > 



£nn,p)< 



log(e^n) — 2.5 log log (e^n) + c 



-(£ + log(l-£)) 

log(e^n) — 2.5 loglog(e^n) 
-{e + \og{l-e)) 



< 2e"^(^) 



< 2e" 



for any positive constant $c > 0$.

Inside Window: Suppose $\lambda := \lambda(n,p) = 1 + \varepsilon$ with $|\varepsilon| = O(n^{-1/3})$. Then, whp,

$$\ell_1^{PC}(n,p) = \Theta(n^{2/3}).$$

(All constants in the $\Omega(\cdot)$'s do not depend on any of $\varepsilon$, $\alpha$ and $c$.)

By Theorem 1.1, a corollary regarding $G(n,p)$ follows.

Corollary 1.5 Supercritical region: Suppose A = \{n,p) = 1 + e with n~^^^ <^ e <^ 1, and 
1 < « < {e^ny/"^. Then, in G{n,p), 

Pr[ \ii{n,p) - e^n\ > a{n/ey/^] < 2e-^^"'\ 
Moreover, for the size i2{n,p) of the second largest component and e* = 1 — (1 — ^;^)A, 

log((e*)^n) - 2.51oglog((£*)^n) + c 



Pr 



h{n,p) > 



and 



Ft 



h{n,p) < 



-{e* + log(l - £*)) 

log((£:*)^?2) - 2.51oglog((e*)^n) - c 
+log(l -£*)) 
for any positive constant c > 0. 

Subcritical region: Suppose X = 1 — e with n^^^'^ <^ e <^ I, then, 

log(£:^?2) — 2.5 loglog(£:'^n) + c 



< 2e^ 



and 



Pr 



Pr 



h{n,p) > 



h{n,p) < 



-(£ + log(l-£)) 
log(£:'^n) — 2.5 loglog(e'^n) — c 



< 2e-^(=\ 



< 2e" 



-{E + \0g{l-E)) 

for any positive constant c > 0. 

Inside Window: Suppose $\lambda := \lambda(n,p) = 1 + \varepsilon$ with $|\varepsilon| = O(n^{-1/3})$. Then, whp,

$$\ell_1(n,p) = \Theta(n^{2/3}).$$



The emergence of the $t$-core. There are at least two possible directions in which to extend the
problem of connected components. Observing that the minimum degree in a component
must be larger than or equal to 1, one may consider subgraphs with minimum degree at
least $t \ge 2$. For a graph $G$, the $t$-core is the largest subgraph with minimum degree at least
$t$. As the minimum degree of the union of two subgraphs is at least the smaller minimum
degree of the two, the $t$-core of a graph is unique. It is also easy to see that the $t$-core must
be an induced subgraph. For this reason, the $t$-core of $G$ sometimes refers to its vertex set.
We denote by $V_t(G)$ (the vertex set of) the $t$-core of $G$. As the 1-core $V_1(G)$ is the set of
all non-isolated vertices, we consider the cases $t \ge 2$ throughout this paper. If there is no
subgraph with minimum degree at least $t$, the $t$-core is defined to be empty.
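Since the $t$-core is the unique largest subgraph with minimum degree at least $t$, it can be found by repeatedly deleting vertices of degree less than $t$. The sketch below uses this standard peeling procedure; it is an illustration and not the cut-off line algorithm of this paper. The function name `t_core` and the adjacency-dictionary input format are assumptions made here.

```python
from collections import deque

def t_core(adj, t):
    """Vertex set of the t-core of a simple graph given as {vertex: set_of_neighbours}."""
    deg = {v: len(nb) for v, nb in adj.items()}
    removed = set()
    queue = deque(v for v, d in deg.items() if d < t)
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                # u is queued exactly when its degree first drops below t.
                if deg[u] == t - 1:
                    queue.append(u)
    return set(adj) - removed

# A triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(t_core(graph, 2))  # the triangle survives
print(t_core(graph, 3))  # empty: no subgraph has minimum degree 3
```

The order in which low-degree vertices are deleted does not affect the result, which reflects the uniqueness of the $t$-core noted above.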
Another direction is to consider $t$-connectivity, where a graph is $t$-connected if the
graph remains connected after any $t - 1$ vertices are removed. Higher orders of connectivity
have been used to understand various structures of graphs. Clearly, if a non-empty subgraph
is $t$-connected, then its minimum degree must be $t$ or larger.

In 1984, Bollobás [10] initiated the study of the $t$-core, $t \ge 2$, and observed that, provided
$t \ge 3$ and $pn$ is larger than a fixed constant, the $t$-core of $G(n,p)$ is non-empty and
$t$-connected whp. Łuczak [51] proved that, for $t \ge 3$, there is an absolute constant $c$ such that
the $t$-core of $G(n,p)$ is either empty, or larger than $cn$ and $t$-connected, whp. In particular,
as far as the random graph $G(n,p)$ is concerned, the $t$-core problem is the same as the
$t$-connectivity problem. Moreover, if $\lambda(n,p)$ is less than 1, then the $t$-core of $G(n,p)$ is empty
whp as the size of the largest component is $O(n^{2/3})$ whp. As $p$ increases while $n$ is fixed,
the probability of the $t$-core of $G(n,p)$ being non-empty keeps increasing. Let $p_t(n,\delta)$ be
the infimum of all $p$ that make the probability larger than or equal to a constant $\delta$ in the
range $0 < \delta < 1$. Then Bollobás's result implies that $n p_t(n,\delta)$ is bounded from above by a
constant. Though $n p_t(n,\delta)$ may still have no limit as $n$ goes to infinity, it seems
more natural to expect that the limit exists. Furthermore, as often happens in phase
transition phenomena, the limit, if it exists, is also expected to be independent of $\delta$. In other
words, the phase transition is expected to be sharp.

For $t = 2$, the 2-core of a graph $G$ is non-empty if and only if $G$ contains a cycle. It is
easy to see by the first moment method that $G(n,p)$ with $p = o(1/n)$ does not contain a
cycle whp. For a constant $c$ in the range $0 < c < 1$, $G(n,p)$ with $pn = c$ has a cycle with
positive probability and has no cycle with positive probability. In particular, the phase transition for the existence of
a non-empty 2-core is not sharp. In the graph process $(G(n,m))_{m=0,1,\ldots}$, in which random
edges are added one by one without repetition, Janson [41] found the limiting distribution for
the length of the first cycle; in particular, he showed that the length is bounded whp. However,
the expectation of the length is known to be $\Theta(n^{1/6})$ due to Flajolet et al. [36]. The two facts
do not contradict each other, since there are random variables that are bounded whp
but whose expectations are not. For example, $\Pr[X = 1] = 1 - 1/n$ and $\Pr[X = n^2] = 1/n$.

Bollobás [10] proved that, if $t \ge 5$ and $\lambda(n,p) := p(n-1) > \max\{67, 2t+6\}$, then $G(n,p)$
has a non-empty $t$-core. Chvátal introduced the notion of the critical $\lambda_t$, without proving
existence, satisfying, as $n$ goes to infinity,

$$\Pr[\,G(n,p) \text{ has a non-empty } t\text{-core}\,] \to \begin{cases} 0 & \text{if } \lambda(n,p) < \lambda_t - \delta \\ 1 & \text{if } \lambda(n,p) > \lambda_t + \delta, \end{cases}$$

for any constant $\delta > 0$. He also proved $\lambda_3 > 2.88$, if it exists, and claimed that $\lambda_4 > 4.52$ and
$\lambda_5 > 6.06$ etc. could be proven by the same method. It was Pittel, Spencer and Wormald
[61] who proved a more general theorem that implies that $\lambda_t$ exists for fixed $t \ge 3$. They
identified the values too. We present a slightly weaker version of the theorem.

For a Poisson random variable $\mathrm{Poi}(\rho)$ with mean $\rho$, let $P(\rho,i) = \Pr[\mathrm{Poi}(\rho) = i]$ and
$Q(\rho,i) = \Pr[\mathrm{Poi}(\rho) \ge i]$, i.e.,

$$P(\rho,i) = e^{-\rho} \frac{\rho^i}{i!} \quad \text{and} \quad Q(\rho,i) = \sum_{j=i}^{\infty} P(\rho,j) = e^{-\rho} \sum_{j=i}^{\infty} \frac{\rho^j}{j!},$$

and let

$$\lambda_t = \min_{\rho > 0} \frac{\rho}{Q(\rho, t-1)}.$$

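Theorem 1.6 Let $t \ge 3$ and $\lambda(n,p) = p(n-1)$. Then

$$\Pr[\,G(n,p) \text{ has a non-empty } t\text{-core}\,] \to \begin{cases} 0 & \text{if } \lambda(n,p) < \lambda_t - n^{-\delta} \\ 1 & \text{if } \lambda(n,p) > \lambda_t + n^{-\delta}, \end{cases}$$

for any $\delta \in (0,1/2)$, and the $t$-core when $\lambda(n,p) > \lambda_t + n^{-\delta}$ has $(1 + o(1)) Q(\theta_\lambda \lambda, t) n$ vertices,
whp, where $\theta_\lambda$ is the largest solution for the equation

$$\theta - Q(\theta\lambda, t-1) = 0.$$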

There have been many studies of the $t$-cores of various types of random graphs and
random hypergraphs. Fernholz and Ramachandran [331 El] have studied random graphs
conditioned on given degree sequences. Cooper [22] found the critical values for the $t$-cores of
uniform multihypergraphs with given degree sequences that include the random $k$-uniform
hypergraph $H(n,p;k)$. Molloy [58] has considered cores for random hypergraphs and random
satisfiability problems for Boolean formulas. Recently, Janson and M. J. Łuczak [43] also
gave seemingly simpler proofs for $t$-core problems that cover the result of Pittel, Spencer
and Wormald. For more information and the techniques used in the above-mentioned papers, readers
may refer to [43].

Using the Poisson cloning model for random hypergraphs together with the cut-off line
algorithm, we completely analyze the $t$-core problem for the random uniform hypergraph.
We also believe that the cut-off line algorithm can be used to analyze the $t$-core problem for
random hypergraphs conditioned on certain degree sequences as in [221 ESI EH US]. As the
2-core of $G(n,p)$ behaves quite differently from the other $t$-cores of $H(n,p;k)$, we exclude
the case $k = t = 2$. That case requires much more careful analysis and will be studied in a
subsequent paper.

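The critical value for the problem turns out to be the minimum $\lambda$ such that there is a
positive solution for the equation

$$\theta - Q(\theta^{k-1}\lambda, t-1) = 0. \qquad (1.1)$$

It is not difficult to check that the minimum is

$$\lambda_{crt}(k,t) := \min_{\rho > 0} \frac{\rho}{Q(\rho, t-1)^{k-1}}. \qquad (1.2)$$

For $\lambda > \lambda_{crt}(k,t)$, let $\theta_\lambda$ be the largest solution for the equation $\theta^{1/(k-1)} - Q(\theta\lambda, t-1) = 0$.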
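The minimum in (1.2) is easy to evaluate numerically. The sketch below is illustrative only: the names `Q` and `lambda_crt`, the search interval, and the use of ternary search are assumptions made here, and the search presumes that $\rho / Q(\rho, t-1)^{k-1}$ has a single local minimum on the interval (the case $k = t = 2$, excluded in the text, is not handled).

```python
import math

def Q(rho, i):
    """Pr[Poi(rho) >= i], computed as 1 minus the lower tail."""
    term, below = math.exp(-rho), 0.0
    for j in range(i):
        below += term            # adds Pr[Poi(rho) = j]
        term *= rho / (j + 1)
    return 1.0 - below

def lambda_crt(k, t, lo=1e-6, hi=50.0, iters=200):
    """Approximate min over rho > 0 of rho / Q(rho, t-1)^(k-1) by ternary search."""
    f = lambda rho: rho / Q(rho, t - 1) ** (k - 1)
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return f((lo + hi) / 2)

print(lambda_crt(2, 3))  # threshold for the 3-core of a random graph
print(lambda_crt(3, 2))  # threshold for the 2-core of a random 3-uniform hypergraph
```

For $k = 2$ the quantity reduces to $\lambda_t = \min_{\rho>0} \rho / Q(\rho, t-1)$, so `lambda_crt(2, 3)` approximates $\lambda_3$, which is consistent with the lower bound $\lambda_3 > 2.88$ quoted earlier.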
Theorem 1.7 Let k,t >2, excluding k = t = 2, and a ^ rT^I'^ . 

Subcritical Phase: If X{n,p;k) := p{^Zl) = -^crt — o' is uniformly bounded from below by 
and i^{k,t) is the minimum i such that (*) > ti/k, then 

Vi[Vt{H{n,p]k)) ^ 0] < 2e"^("'") + 0{n~^'-^-''^^'^^^''^), 
and, for any $\delta > 0$,



FT[\Vt{H{n,p;k))\ > 6n] < 2e-^('^'") + 26"^^'^''^''''""). (1.3) 

Supercritical Phase: If X = \{n,p; k) = Acrt + cr is uniformly bounded from above, then, for 
all a in the range 1 <^ a <^ an^/^, 

Pi[\\Vt{n,p-k)\-Q{9,X,t)n\>a{n/aY/^]=e'''^''"\ 

and, for any i > t and the sets Vt{i) (resp. Wt{i) ) of vertices of degree i (resp. larger than 
or equal to i) in the t-core. 



Pr 

and 

Pr 



Vt{i)\- P{e^\i)n >6n\ < 2e-^(°^i^^^''^"''^'"», 



> 5n 



< 2e 



\Wt{i)\-Q{e,\i)n 
In paHicular, for p^^^ := 9x,,,(k,t)Krt{k,t) , 



with probability 1 — 2e f^(min{iog2 n.o-^n}) ^ Moreover, if all \ Vt{i)\, i > t, are given, each simple 
graph with the degree sequence induced by \Vt{i)\, i >t, is equally likely to be the t-core. 

As one might guess, we will prove a stronger theorem (Theorem 6.2) for the Poisson
cloning model $H_{PC}(n,p;k)$, from which Theorem 1.7 easily follows.

The pure literal algorithm for the random $k$-SAT problem. Recently the satisfiability
problem for Boolean formulas has played a central role in the theory of computational
complexity. An instance of the problem is a formula given in conjunctive normal form
(CNF), that is, a conjunction of disjunctions. Each disjunction, or clause, is of the form
$(y_1 \vee \cdots \vee y_k)$, where the $y_i$'s are chosen among $2n$ literals consisting of $n$ Boolean variables and
their negations. Given a formula, the problem is whether there exists an assignment of the $n$
variables satisfying the formula. When such an assignment exists, the formula is satisfiable.
Otherwise, it is unsatisfiable. A pair of literals $y$, $z$ is strictly distinct if $y$ is neither $z$ nor
the negation of $z$. When the input formula is restricted to have only clauses of $k$ pairwise
strictly distinct literals, called $k$-clauses, the problem is called the $k$-SAT problem.

It is known that the satisfiability problem is NP-complete [21], so that determining
whether an arbitrary formula is satisfiable is regarded as very difficult (assuming P
$\ne$ NP), in the sense that it is at least as hard as any problem whose solutions can be
verified in polynomial time. Cook [21] proved that even the $k$-SAT problem for $k \ge 3$ is
NP-complete, while the 2-SAT problem can be solved in polynomial time. Given these facts,
researchers have tried to find heuristic algorithms that are able to determine, in polynomial
time, the satisfiability of most $k$-SAT formulas, especially 3-SAT formulas. Among others,
a number of heuristic algorithms based on the Davis-Putnam algorithm have been considered.
Since one has to define "most" before showing that his or her algorithm works for most
$k$-SAT formulas, random models for the $k$-SAT problem have been introduced.

The most common models for the random $k$-SAT problem are the uniform model $F(n,m;k)$
and the model $F(n,p;k)$. Here, $F(n,m;k)$ is sampled uniformly at random from the set of all
$k$-SAT formulas with $n$ variables and $m$ clauses. The other model may be constructed by
selecting each $k$-clause with probability $p$ independently of all other clauses. The random
formula $F(n,p;k)$ is the conjunction of all selected clauses. Since there are $2^k \binom{n}{k}$ $k$-clauses
altogether, the expected number of clauses in the formula is $m_p := 2^k p \binom{n}{k}$. Other models
include one formed by selecting a uniform random $k$-clause $m$ times with replacement. It is
known that these three models are essentially equivalent if $m_p = m$ and $m = \Theta(n)$.

Not surprisingly, the random 2-SAT and the random 3-SAT problems have been most
extensively studied and many results have been published. For $k = 2$, Chvátal and Reed
[20], Goerdt and Fernandez de la Vega [35] independently proved that the random 2-SAT
problem undergoes a phase transition at $2pn = 1$, that is, for $F(n,p) = F(n,p;2)$,

$$\lim_{n \to \infty} \Pr[F(n,p) \text{ is satisfiable}] = \begin{cases} 1 & \text{if } \limsup_{n \to \infty} 2pn < 1 \\ 0 & \text{if } \liminf_{n \to \infty} 2pn > 1. \end{cases}$$

Since there are $2n$ literals, $2pn$ is the right parameter. Bollobás et al. [13] took more
sophisticated approaches to determine the scaling window for the problem: Let $\rho \gg n^{-1/3}$.
Then

$$\Pr\left[F\left(n, \tfrac{1-\rho}{2n}\right) \text{ is satisfiable}\right] = 1 - \Theta\left(\frac{1}{\rho^3 n}\right),$$

and

$$\Pr\left[F\left(n, \tfrac{1+\rho}{2n}\right) \text{ is satisfiable}\right] = e^{-\Theta(\rho^3 n)}.$$

Though it is believed that the random $k$-SAT problem, $k \ge 3$, undergoes a similar phase
transition, it remains a conjecture. Only sharp transitions are known by a seminal result
of Friedgut [32].

To find a satisfying assignment for a $k$-SAT formula, one may apply the pure literal
algorithm (PLA): A literal is pure in the formula if it belongs to one or more clauses of
the formula, while its negation is in no clause. PLA keeps selecting a pure literal, setting
it true, and removing the clauses containing the literal as they are already satisfied. It stops
when there are no more pure literals. We say that PLA succeeds if no clause remains in
the formula after it stops, and it fails otherwise. Clearly, the formula is satisfiable if PLA
succeeds. The converse is not true, for example, V A V z) is satisfiable whereas 
no pure literal exists. Broder, Frieze, and Upfal [18] analyzed PLA for the random 3-SAT
problem to show that, for $F(n,p;3)$, if $\limsup m_p/n < 1.63$ then PLA succeeds whp, and if
$\liminf m_p/n > 1.7$ then it fails whp. Mitzenmacher [57] used the differential equation method
introduced by Wormald [53] to claim, without rigorous proof, that the threshold for PLA
exists and is the solution of certain equations, which are somewhat complicated. That is,
there is a constant $c_k$, $k \ge 3$, so that PLA succeeds whp if $\limsup m_p/n < c_k$ and fails whp if
$\liminf m_p/n > c_k$. For more about upper bounds for the random satisfiability problems, readers
may refer to [T8llMll25l|371ll5lll7lllHl[50l[6l]. For more advanced algorithms than PLA, which
give various improved lower bounds, one may refer to [H El El [ISl ESj. A variation of the
random satisfiability problem called the $(2 + p)$-SAT problem can be found in [U |
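The pure literal algorithm described above admits a direct, if inefficient, implementation. The sketch below is an illustration under assumptions made here: literals are encoded as nonzero integers (variable $i$ as `i`, its negation as `-i`), clauses are tuples of such integers, and the function name `pure_literal_algorithm` is hypothetical. It rescans the formula after every step, so it runs in quadratic time.

```python
def pure_literal_algorithm(clauses):
    """Run PLA on a CNF formula; return (partial_assignment, remaining_clauses).

    PLA succeeds exactly when the returned list of remaining clauses is empty.
    """
    remaining = [frozenset(c) for c in clauses]
    assignment = {}
    while True:
        literals = {lit for c in remaining for lit in c}
        pure = [lit for lit in literals if -lit not in literals]
        if not pure:
            return assignment, remaining
        lit = pure[0]
        assignment[abs(lit)] = lit > 0   # set the pure literal true
        # Clauses containing the pure literal are already satisfied.
        remaining = [c for c in remaining if lit not in c]

# PLA succeeds here: literals 2, 3 and 4 are pure in the initial formula.
print(pure_literal_algorithm([(1, 2, 3), (-1, 2, 4)])[1])
# PLA fails here although x1 = True, x2 = False satisfies the formula.
print(pure_literal_algorithm([(1, 2), (-1, -2)])[1])
```

Which pure literal is selected at each step is arbitrary in this sketch; only the emptiness of the remaining formula is used to decide success.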
For rigorous analysis of PLA, we consider the Poisson cloning model for the random
$k$-SAT problems, $k \ge 2$. Since a $k$-clause can be regarded as a (hyper)edge consisting
of $k$ vertices, the Poisson cloning model $F_{PC}(n,p;k)$ can be defined as $H_{PC}(2n,p;k)$ on
$\{x_1, \ldots, x_n, \bar{x}_1, \ldots, \bar{x}_n\}$ without edges that contain both a variable and its negation. Then
it is not difficult to establish an asymptotic equivalence between $F_{PC}(n,p;k)$ and $F(n,p;k)$
from Theorem 1.1. The details can be found in the next section.

Theorem 1.8 Suppose p = Q{n^^''). Then, for any collection T of k- SAT formulas, 

c*Pr[F^^(n,p;fc) G J^] < FT[F{n,p;k) G J^] < c;(Pr[F^^(n,p ; A;) G J^]^ +6""), 

where 

p(l-l/fc) Ik^l2n\ f k \ 



+ o 1 



(^)((^-iK) +0(1). 



and o(l) goes to as n goes infinity. 



As mentioned above, PLA undergoes a sharp phase transition. Actually, for $k \ge 3$, it
turns out that the phase transition is similar to that of the 2-core in random hypergraphs.
The case $k = 2$ is similar to the 2-core problem of $G(n,p)$ and will be studied in a subsequent
paper. Let

$$\lambda_{crt}(k) := \min_{\rho > 0} \frac{\rho}{Q(\rho,1)^{k-1}} = \min_{\rho > 0} \frac{\rho}{(1 - e^{-\rho})^{k-1}},$$

and, for $\lambda > \lambda_{crt}(k)$, let $\theta_\lambda$ be the largest solution of the equation $\theta^{1/(k-1)} - 1 + e^{-\theta\lambda} = 0$.
Denote by $X_R(n,p;k)$ the set of variables in $X := \{x_1, \ldots, x_n\}$ whose truth values are not
determined by PLA. The remaining formula is denoted by $F_R(n,p;k)$. The residual degree
$d_R(z)$ of a literal $z$ of $X_R(n,p;k)$ is the number of clauses in $F_R(n,p;k)$ containing $z$. It is
easy to see that $F_R(n,p;k)$ is independent of the choices of pure literals when PLA is carried
out.

Theorem 1.9 Let X(n,p;k) = p(\"r/), A: > 3 and a > n"^/^. If X{n,p;k) < X{k) - a is 
uniformly bounded from below by and i^ik) is the minimum such that 2'=Q) > 2i/k, then 

Pr[XR{n,p;k) 7^ 0] < 2e-^("'") + 0(n-(i-2/fcK,(fc))_ 

Supercritical Phase: If X = X{n,p; k) = Xcrt + cr is uniformly bounded from above, then, for 

all a in the range 1 <^ a <^ an}!'^ , 

Pr[ I \XR{n,p;k)\ - (1 - e-'^^fn\ > a{n/ay/^] = e-^("'\ 

and, for any i,j>l and the sets Xp{i,i) (resp. Yp{i,j) ) of variables x G Xp{n,p ; k) with 
dnix) = i, dnix) = j (resp. dn{x) > i, dplx) > j). 



and 



Pr 



Pr 



IX 



R[hJ) 



P{9^X,z)P{9,X,j)n 



\Yn{t,j)\-Q{9^X,t)Qi9^X,j)n 



> 6n 



> Sn 



Moreover, if all $|X_R(i,j)|$, $i,j \ge 1$, are given, each formula with the degree sequence induced
by $|X_R(i,j)|$, $i,j \ge 1$, is equally likely to be the residual formula $F_R(n,p;k)$.

As for the $t$-core problem, we will prove a stronger theorem (Theorem 7.3) for the Poisson
cloning model $F_{PC}(n,p;k)$, from which Theorem 1.9 easily follows.

The rest of the paper is organized as follows. In the next section, the Poisson cloning
model is defined in detail. The cut-off line algorithm and the cut-off line lemma are
presented in Section 3. Section 4 is for Chernoff-type large deviation inequalities that will
be used in most of our proofs. In Section 5, a generalized core is defined and the main lemma
is presented. As the proof of Theorem 1.4 is more sophisticated, Theorems 1.7 and 1.9 are
first proved in Sections 6 and 7. Section 8 is for the proof of Theorem 1.4. Closing remarks
follow in Section 9.

2 The Poisson Cloning Model 

In this section, we define the Poisson cloning models Gpc{n,p) for random graphs and 
generally Hpc{n,p; k) for random hypergraphs. Then, Theorem ll.il will be proven. 

To construct Gpc'{n,p), we first take i.i.d Poisson A = p{n — 1) random variables d{v) 
indexed by vertices v in the set V with \V\ = n. Then, take d{v) copies of each vertex 
V & V. The copies of v are called clones of v, or simply v-clones. Since the sum of Poisson 
random variables is also a Poisson random variable, the total number A^a '■= Ylvev^i"^) '^^ 
clones is a Poisson An random variable. It is sometimes convenient to take a reverse, but 
equivalent, construction. We first take a Poisson An = '^p{^) random variable A^a and then 
take A^A unlabelled clones. Each clone is to be independently labelled as f-clone uniformly 
at random, in the sense that v is chosen uniformly at random from V. It is well-known that 
the numbers d{v) of f-clones are i.i.d Poisson random variables with mean A. 

If $N_\lambda$ is even, the multigraph $G_{PC}(n,p)$ is defined by generating a (uniform) random
perfect matching of those $N_\lambda$ clones, and contracting clones of the same vertex. That is,
if a $v$-clone and a $w$-clone are matched, then the edge $\{v,w\}$ is in $G_{PC}(n,p)$ with
multiplicity. In the case $v = w$, it produces a loop that contributes 2 to the degree of $v$. If
$N_\lambda$ is odd, we may define $G_{PC}(n,p)$ to be any graph with a special loop that, unlike
other loops, contributes only 1 to the degree of the corresponding vertex. In particular, if
$N_\lambda$ is odd, $G_{PC}(n,p)$ is not a simple graph.
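
As an illustration of the matching-and-contraction step, here is a small sketch. It assumes the clone counts are already given as a list, and it makes one particular (hypothetical) choice for odd totals: the leftover clone's vertex receives the special loop. The text allows any definition in that case.

```python
import random


def poisson_cloning_graph(degrees, rng):
    """Match clones uniformly at random and contract clones of the same vertex.

    Returns (edges, special_loop). Each edge is a pair of vertices; a pair
    (v, v) is an ordinary loop. special_loop is the vertex of the leftover
    clone when the number of clones is odd (an assumed convention), else None.
    """
    clones = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Pairing consecutive entries of a uniform shuffle is a uniform perfect matching.
    rng.shuffle(clones)
    special_loop = clones.pop() if len(clones) % 2 == 1 else None
    edges = [(clones[i], clones[i + 1]) for i in range(0, len(clones), 2)]
    return edges, special_loop
```

Every vertex v appears exactly degrees[v] times among the edge endpoints (plus once more as the special loop if it owns the leftover clone).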

Strictly speaking, $G_{PC}(n,p)$ depends on how it is defined when $N_\lambda$ is odd. However,
if only simple graphs are concerned, the case of $N_\lambda$ being odd does not matter. For
example, the probability that $G_{PC}(n,p)$ is a simple graph with a component larger than $0.1n$
does not depend on how $G_{PC}(n,p)$ is defined when $N_\lambda$ is odd, as it is not a simple graph
anyway. Generally, for any collection $\mathcal{G}$ of simple graphs, the probability that
$G_{PC}(n,p)$ is in $\mathcal{G}$ is totally independent of how $G_{PC}(n,p)$ is defined when
$N_\lambda$ is odd. Notice that a property of simple graphs is nothing but a collection of simple
graphs. Therefore, when properties of simple graphs are concerned, it is not necessary to describe
$G_{PC}(n,p)$ for odd $N_\lambda$.

Here are two specific ways to define $G_{PC}(n,p)$.

Example 2.1 One may keep matching two clones chosen uniformly at random among all
unmatched clones.

Example 2.2 One may keep choosing one's favorite unmatched clone, and matching it
to a clone selected uniformly at random from all other unmatched clones.

If $N_\lambda$ is even, both examples yield uniform random perfect matchings. If $N_\lambda$ is odd,
each of them yields a matching and an unmatched clone. We may then create the special
loop at the vertex with which the unmatched clone is labelled. More specific ways
to choose random clones will be described in the next section.
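
The claim that the procedure of Example 2.2 yields a uniform random perfect matching can be checked exactly on small instances. The sketch below is an illustration only: it assumes an even number of clones, and the `choose` argument stands in for a deterministic "favorite clone" rule (here `min` or `max`), which is a hypothetical choice.

```python
from fractions import Fraction


def matching_distribution(clones, choose=min):
    """Exact distribution over perfect matchings when the clone picked by
    `choose` is matched to a uniformly random other unmatched clone."""
    clones = frozenset(clones)
    if not clones:
        return {frozenset(): Fraction(1)}
    first = choose(clones)
    rest = clones - {first}
    dist = {}
    for partner in rest:
        # Recurse on the clones left after removing the new pair.
        for matching, prob in matching_distribution(rest - {partner}, choose).items():
            key = matching | {frozenset((first, partner))}
            dist[key] = dist.get(key, Fraction(0)) + prob / len(rest)
    return dist
```

For six clones there are 15 perfect matchings, and the returned dictionary assigns each of them the same probability regardless of the rule passed as `choose`.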

Generally, for $k \ge 3$, the Poisson cloning model $H_{PC}(n,p;k)$ for $k$-uniform hypergraphs
may be defined in the same way: We take i.i.d. Poisson random variables $d(v)$, $v \in V$, with
mean $\lambda = p\binom{n-1}{k-1}$, and then take $d(v)$ clones of each $v$. If
$N_\lambda := \sum_{v \in V} d(v)$ is divisible by $k$, the multihypergraph $H_{PC}(n,p;k)$ is
defined by generating a uniform random perfect matching consisting of $k$-tuples of clones, and
contracting clones of the same vertex. That is, if a $v_{i_1}$-clone, a $v_{i_2}$-clone, ..., a
$v_{i_k}$-clone are matched in the perfect matching, then the edge
$\{v_{i_1}, v_{i_2}, \dots, v_{i_k}\}$ is in $H_{PC}(n,p;k)$ with multiplicity. If $N_\lambda$ is
not divisible by $k$, $H_{PC}(n,p;k)$ may be any hypergraph with a special edge consisting of
$N_\lambda - k\lfloor N_\lambda/k \rfloor$ vertices. In particular, $H_{PC}(n,p;k)$ is not
$k$-uniform when $N_\lambda$ is not divisible by $k$. Therefore, as long as properties of simple
$k$-uniform hypergraphs are concerned, we do not have to describe $H_{PC}(n,p;k)$ when $N_\lambda$
is not divisible by $k$.
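
The hypergraph version can be sketched in the same way. In this illustration the function names are hypothetical, and the handling of a total not divisible by k (the leftover clones form the special edge) is one assumed convention among those the text permits.

```python
import math
import random


def poisson_cloning_hypergraph(degrees, k, rng):
    """Group clones into uniformly random k-tuples and contract them.

    Returns (edges, special_edge); special_edge holds the leftover
    N - k * floor(N / k) clones' vertices, or None if k divides N.
    """
    clones = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(clones)
    leftover = len(clones) % k
    full = len(clones) - leftover
    special_edge = tuple(clones[full:]) if leftover else None
    edges = [tuple(clones[i:i + k]) for i in range(0, full, k)]
    return edges, special_edge


def mean_degree(n, p, k):
    """lambda = p * C(n-1, k-1), the Poisson mean of each d(v)."""
    return p * math.comb(n - 1, k - 1)
```

Note that n * mean_degree(n, p, k) equals k * p * C(n, k), the expected total number of clones.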

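We show that the Poisson cloning model $H_{PC}(n,p;k)$, $k \ge 2$, is contiguous to the classical
model $H(n,p;k)$ when the expected average degree is a constant.

Theorem 1.1 (Restated) Suppose $k \ge 2$ and $p = \Theta(n^{1-k})$. Then, for any collection
$\mathcal{H}$ of $k$-uniform simple hypergraphs,

$$c_1 \Pr[H_{PC}(n,p;k) \in \mathcal{H}] \le \Pr[H(n,p;k) \in \mathcal{H}] \le c_2\big(\Pr[H_{PC}(n,p;k) \in \mathcal{H}]^{1/k} + e^{-n}\big),$$

where $c_1$ and $c_2$ are the constants given in the statement of Theorem 1.1 in the introduction,
and $o(1)$ goes to 0 as $n$ goes to infinity.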

Proof. We assume that the random perfect matching is generated by repeatedly choosing $k$
unlabelled clones and labelling them uniformly at random, as any other way to generate it is
equivalent provided $N_\lambda$ is divisible by $k$. Let $H$ be a fixed simple $k$-uniform
hypergraph with $m$ edges. Then $H_{PC}(n,p;k) = H$ if and only if $N_\lambda = km$ and the $km$
clones are labelled so that $H$ is yielded after contraction. The first $k$ clones are labelled
to be one of the $m$ edges with probability
$m \frac{k}{n}\frac{k-1}{n}\cdots\frac{1}{n} = m\frac{k!}{n^k}$, and the second $k$ clones are
labelled to be one of the remaining $m-1$ edges with probability $(m-1)\frac{k!}{n^k}$, and so on.
That is,




where 



c, = 





As $N_\lambda$ is a Poisson random variable with mean $\lambda n = kp\binom{n}{k}$, we have




e 
unless $m = 0$. Therefore,
implies that 

Fr[Hpc{n,p;k) = H] ^ pm+l^(l)+0(p''n>'+p^m+l/m) 

Vi[H{n,p]k) = H] 



Since 



(27rm)-V2^!f-r = l + 0(i) = e«(VH, and ^ f?! = e-(^+®(Vn))0^ 
\m/ n'^ \k J 

we conclude that 

Vi[Hpc{n,p-k) = H] _ ,-i/2(-p{l)P;;!&^Y-\-a^^^ 
Pr[H(n,p;k)^H] ~ \ (f)- ) 

For the case m = 0, we replace 1/m in the last O(-) term by Then, it is easy to see 
Pr[Hpc{n,p-k) = H] ^ y^/ (i+e(i))(^)^-pn.+^(^)+o(p3n'-+pW^) 

(2.1) 

for all m in the range < m < g). Let i?^ = e"^UJ^l^^, or equivalently i?„ = 

e-^(^)'" by An = A;pg). Then ^ = (1 + C>(^))^. This gives that has its 
maximum 1 + O(^) when m — ^ -\- assuming A = ©(1), or p = Qin}"^). Moreover, 

it is not difficult to show that 

i?„ = (l + 0(JL))e ^2a;^ if \m-^\<n, and < e'^d*^"^-^"!) otherwise. 

Hence 

r r|x7 1 77/, p fv ) — -O I 

which yields 

$$\Pr[H(n,p;k) = H] \ge c_1 \Pr[H_{PC}(n,p;k) = H].$$

Thus,

$$\Pr[H(n,p;k) \in \mathcal{H}] = \sum_{H \in \mathcal{H}} \Pr[H(n,p;k) = H] \ge \sum_{H \in \mathcal{H}} c_1 \Pr[H_{PC}(n,p;k) = H] = c_1 \Pr[H_{PC}(n,p;k) \in \mathcal{H}].$$

For the upper bound, take the minimum $m_1 \ge p\binom{n}{k}$ such that

e-(™i-P©)((2)/«+p)i?^^^ < (^c,{k-l)PT[Hpc{n,p;k) G 7^])'^' + 6"". (2.2) 

It is routine to check that $m_1 = \Theta(n)$. Let

$\mathcal{H}_1 = \{H \in \mathcal{H} : \text{the number of edges in } H \text{ is at least } p\binom{n}{k}\}$.

Then

Vi[H{n,p]k) eUi] = Y FT[H{n,p;k) = H]+ J] ^^t^^^'-^ ' ^) = ^] 



|Hl=m 



< Pr 



n 



Bin ( I ^ ) , p ) > mj 



^ J2 FT[H{n,p;k) = H]. 

"i: Hew 



For m> m. 



Pr 


Bin(^ 




= m 




Pr 


; Bin(G; 




= 


= m — 1 



< (l+0(p)) 



implies that



Pr 



n 



Bin( I ^ ) ,P) > 



< (1/2 + o(l))(27rmJ^/2pr 



Bin( ( ^ ) ,p j = 



(2.3) 



and, if ~ ^ "-^''^ 
Pr 



Bin M ^ ) , p ) > 



< Pr 



Bin( {^j,pj =m^ 



(2.4) 



Observe that 
Pr 



Bin I \ ^],P} = 



-e V 2 



)/(fc)+P™i-#(fe)+0(l/™) 



8jS Rm^ 6 ^ '"-1 ^m^ 



(27rmJ^/"'(-^)'^-i 

= (1 + o(i))(27rmj-i/2i?^^e-C".0/(;:)+^^-i~^ (::)+o(i/n)^ 

KIOCS^. Since 



22D together with (Q and ([23D gives 



k) 2( 



i^(m,-p("))2 + 0(l/n). 



Pr 



n 



Bin ( I ^ ) , J9 ) > 



< (l/2 + o(l))((c,(A;-l)Pr[/7pc(n,p;A;) G7^])'^' + e-"). (2.5) 
For $m$ in the range $p\binom{n}{k} \le m < m_1$,

e-(-i-pft))((2)/"+p)i?^ > [c,{k-l)¥i[Hpc{n,p-k) G H])^'^ . 
Provided $H$ has $m$ edges, (2.1) yields
¥r[H{n,p-k) = H] = (l + o(l))A;i/2^^-'=e(2)v+P— ^©Pr[i7pc(n,p;A;) = i7] 

2 1-* 

< (1 + o(l))A;i/2gf(^)(^)+Vft)(^c,(A;- l)Pr[ifpc7(n,p;A;) G Ti:]) ' 
X Pr[/fpc(n,p;A;) = ff]. 

Hence 



k 



J2 5^ Pr[i/(n,p;fc) = i7] < {c,+o{l)){c,{k-l)PT[Hpc{n,p;k)en] 

X FT[Hpcin,p;k)eni]. 

This together with (2.5) gives

Pr[H{n,p;k) eHi] < (1/2 + 0(1)) (^(^c,{k - I) Fi[Hpc{n,p ; k) e n]y^\ e-^(^)^ 

l-k 

+ {c,+o{l))(c,{k-l)PT[Hpc{n,p-k)en]^ " Pi[Hpcin,p-k)eni] 
Similarly, for the maximum $m_2 < p\binom{n}{k}$ such that

Rm, < (c, {k - 1) FT[Hpc{n, p-k)en])'\ e-^ 

and

$\mathcal{H}_2 = \{H \in \mathcal{H} : \text{the number of edges in } H \text{ is less than } p\binom{n}{k}\}$,

we have

i/fc 



PT[H{n,p;k)en2] < il/2 + o{l))(^(^c,ik-l)Fi[Hpcin,p;k)en]^ +e 

i-fc 

+ {c,+o{l))(c,{k-l)Pi[Hpc{n,p-k)en]^ ' Pi[Hpcin,p-k) en 
As $\Pr[H(n,p;k) \in \mathcal{H}] = \Pr[H(n,p;k) \in \mathcal{H}_1] + \Pr[H(n,p;k) \in \mathcal{H}_2]$, we finally have

$$\Pr[H(n,p;k) \in \mathcal{H}] \le (1+o(1))\Big(\big(c_1(k-1)\Pr[H_{PC}(n,p;k) \in \mathcal{H}]\big)^{1/k} + e^{-n}\Big) + (c_1+o(1))\big(c_1(k-1)\Pr[H_{PC}(n,p;k) \in \mathcal{H}]\big)^{\frac{1-k}{k}} \Pr[H_{PC}(n,p;k) \in \mathcal{H}]$$

$$\le c_2\big(\Pr[H_{PC}(n,p;k) \in \mathcal{H}]^{1/k} + e^{-n}\big).$$

□
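
The labelling step at the start of the proof can be checked by brute force on tiny instances. Multiplying the successive probabilities $m\,k!/n^k$, $(m-1)\,k!/n^k$, ... suggests that, given $N_\lambda = km$, a fixed simple hypergraph with $m$ edges arises with probability $m!\,(k!/n^k)^m$. The sketch below is only a sanity check under that reading; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def labelling_probability(n, k, edges):
    """Probability that k*m clones, labelled independently and uniformly from
    {0, ..., n-1} and grouped into consecutive blocks of k, contract to
    exactly the given simple k-uniform hypergraph (by full enumeration)."""
    target = sorted(tuple(sorted(e)) for e in edges)
    m = len(target)
    hits = 0
    for labels in product(range(n), repeat=k * m):
        blocks = sorted(tuple(sorted(labels[i:i + k])) for i in range(0, k * m, k))
        hits += blocks == target
    return Fraction(hits, n ** (k * m))
```

The enumeration is exponential in k*m, so it is meant only for very small n, k and m.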

Theorem 1.1 may be generalized to the case that there is a small number of forbidden
edges. For example, the random $k$-SAT formula $F(n,p;k)$ may be regarded as $H_{PC}(2n,p;k)$
on $V = \{x_1, \bar{x}_1, \dots, x_n, \bar{x}_n\}$ without edges that contain both a variable and
its negation. Suppose there is a set $B$ of forbidden edges with $|B| = \frac{\beta}{n}\binom{n}{k}$
for $\beta = O(1)$. Each edge not in $B$ is in the random $k$-uniform hypergraph $H^{(B)}(n,p;k)$
with probability $p$ independently of all other edges.

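Theorem 2.3 Suppose $k \ge 2$, $p = \Theta(n^{1-k})$ and $B$ is a set of
$\frac{\beta}{n}\binom{n}{k}$ forbidden edges with $\beta = O(1)$. Then, for any collection
$\mathcal{H}$ of simple $k$-uniform hypergraphs without edges in $B$,

$$c_1(\beta)\Pr[H_{PC}(n,p;k) \in \mathcal{H}] \le \Pr[H^{(B)}(n,p;k) \in \mathcal{H}] \le c_2(\beta)\big(\Pr[H_{PC}(n,p;k) \in \mathcal{H}]^{1/k} + e^{-n}\big),$$

where

$$c_1(\beta) = c_1 e^{\frac{p(\beta+o(1))}{n}\binom{n}{k}}, \qquad c_2(\beta) = c_2 e^{\frac{p(\beta+o(1))}{n}\binom{n}{k}}.$$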

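Proof. The result follows since

$$\frac{\Pr[H^{(B)}(n,p;k) = H]}{\Pr[H(n,p;k) = H]} = (1-p)^{-|B|} = e^{\frac{p(\beta+o(1))}{n}\binom{n}{k}}$$

for $H \in \mathcal{H}$ implies that

$$\Pr[H^{(B)}(n,p;k) \in \mathcal{H}] = e^{\frac{p(\beta+o(1))}{n}\binom{n}{k}} \Pr[H(n,p;k) \in \mathcal{H}].$$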



For the random /c-SAT formulas, suppose P^. satisfies 

ie:)Kt)-K:)' — l-i^ 

Then 

2''(^) 1 (k\^ 1 (k 



To Uv 



□ 



e 



implies that 

=^ , , 

Corollary 2.4 If $k \ge 2$ and $p = \Theta(n^{1-k})$, then, for any collection $\mathcal{T}$ of $k$-SAT formulas,

< PT[Fpc{n,p- k)eT]< Pr[F(n,p ; k) e T] < cl{Pi[Fpc{n,p- k) e T]'^ + e'") , 
where 

cl = k^/^e^Mi'kh4{'k) + 0(1), < = e^©m (^) ((A: - 1)<)''' + o(l). 



□ 
3 The Poisson λ-Cell and the Cut-Off Line Algorithm



To generate a uniform random perfect matching of $N_\lambda$ clones, we may keep matching $k$
unmatched clones uniformly at random (cf. Example 2.1). Another way is to choose the
first clone as we like and match it to $k-1$ clones selected uniformly at random among all
other unmatched clones (cf. Example 2.2). As there are many ways to choose the first clone,
we may take a way that makes the given problem easier to analyze. Formally, a sequence
$S = (S_i)$ of choice functions determines how to choose the first clone at each step, that is,
$S_i$ tells which unmatched clone is to be the first clone to form the $i$-th edge in the random
perfect matching. A choice function may be deterministic or random. If fewer than $k$ clones
remain unmatched, the edge consisting of those clones will be added. The clone chosen by
$S_i$ is called the $i$-th chosen clone, or simply a chosen clone.

We also present a more specific way to select the $k-1$ random clones to be matched
to the chosen clone. The way presented here will be useful to solve the problems mentioned in
the introduction. First, independently assign to each clone a uniform random real number
between 0 and $\lambda = p\binom{n-1}{k-1}$. For the sake of convenience, a clone is called the
largest, the smallest, etc. if so is the number assigned to it. In addition, for
$0 \le \theta \le 1$, a clone is called $\theta\lambda$-large (resp. $\theta\lambda$-small) if its
assigned number is larger than or equal to (resp. smaller than) $\theta\lambda$. To visualize the
labelled clones with assigned numbers, one may consider $n$ horizontal line segments from $(0,j)$
to $(\lambda,j)$, $j = 0, \dots, n-1$, in the two-dimensional plane $\mathbb{R}^2$. The
$v_j$-clone with assigned number $x$ can be regarded as the point $(x,j)$ in the corresponding
line segment. Then, each line segment with the points corresponding to clones with assigned
numbers is an independent Poisson arrival process with density 1, up to time $\lambda$. The set of
these Poisson arrival processes is called a Poisson $(\lambda,n)$-cell, or simply a $\lambda$-cell.
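
A λ-cell is easy to simulate directly from this description. The following sketch is an illustration only (the function name and the list-of-lists representation are assumptions): each vertex receives the arrival times of a rate-1 Poisson process on [0, λ], generated from exponential gaps.

```python
import random


def poisson_cell(n, lam, rng):
    """Return n sorted lists; entry j holds the numbers assigned to the clones
    of vertex j, generated as a rate-1 Poisson arrival process on [0, lam]."""
    cell = []
    for _ in range(n):
        points, t = [], rng.expovariate(1.0)
        while t <= lam:
            points.append(t)
            t += rng.expovariate(1.0)
        cell.append(points)
    return cell
```

The number of points on each segment is then Poisson with mean lam, and given that number the points are i.i.d. uniform on [0, lam], which matches the construction by assigned numbers.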

We will consider sequences of choice functions that choose an unmatched clone without
changing the joint distribution of the numbers assigned to all other unmatched clones. Such
a choice function is called oblivious. A sequence of oblivious choice functions is also called
oblivious. The choice function that chooses the largest unmatched clone is not oblivious, as
the numbers assigned to the other clones must be smaller than the largest assigned number.
As an instance of an oblivious choice function, one may consider the choice function that
chooses a $v$-clone for a vertex $v$ with fewer than 3 unmatched clones. For a more general
example, let a vertex $v$ and its clones be $t$-light if there are fewer than $t$ unmatched $v$-clones.

Example 3.1 Suppose there is an order of all clones that is independent of the assigned 
numbers. The sequence of the choice functions that choose the first t-light clone is oblivious. 

A cut-off line algorithm is determined by a sequence of oblivious choice functions. Once
a clone is obliviously chosen, the largest $k-1$ clones among all unmatched clones are to
be matched to the chosen clone. This may be further implemented by moving the cut-off
line to the left until $k-1$ clones are found: Initially, the cut-off line of the $\lambda$-cell is
the vertical line in $\mathbb{R}^2$ containing the point $(\lambda,0)$. The initial cut-off value, or
cut-off number, is $\lambda$. At the first step, once the chosen clone is given, move the cut-off
line to the left until exactly $k-1$ unmatched clones, excluding the chosen clone, are on or to the
right of the line. These $k-1$ clones together with the chosen clone form the first edge in the
random perfect matching. The new cut-off value $\Lambda_1$ is the number assigned to the
$(k-1)$-th largest clone. Here we assume that no two distinct clones are assigned the same number,
as the probability of such an event is 0. The new cut-off line is, of course, the vertical line
containing $(\Lambda_1,0)$. Repeating this procedure, one may obtain the $i$-th cut-off value
$\Lambda_i$ and the corresponding cut-off line.
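
The procedure just described can be sketched as follows. This is a minimal illustration, not the paper's implementation: clones are identified by their index in `numbers`, the caller supplies the choice function `choose` (and is responsible for it being oblivious), and, unlike the text, the sketch simply stops when fewer than k clones remain instead of adding a final short edge.

```python
def cut_off_line_matching(numbers, k, choose, lam):
    """numbers[c] is the number assigned to clone c.

    Returns (edges, cutoffs): edges[i] is the (i+1)-th edge as a tuple of
    clone indices, cutoffs[i] the cut-off value after that step.
    """
    order = sorted(range(len(numbers)), key=lambda c: -numbers[c])
    unmatched = set(order)
    edges, cutoffs, pos, cutoff = [], [], 0, lam
    while len(unmatched) >= k:
        chosen = choose(unmatched)
        unmatched.discard(chosen)
        partners = []
        # Move the cut-off line left until k-1 unmatched clones are found.
        while len(partners) < k - 1:
            c = order[pos]
            pos += 1
            if c in unmatched:
                unmatched.discard(c)
                partners.append(c)
                cutoff = numbers[c]
        edges.append((chosen, *partners))
        cutoffs.append(cutoff)
    return edges, cutoffs
```

For example, with `choose=min` the chosen clone is the unmatched clone of smallest index, a rule that does not look at the assigned numbers.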

Notice that, after the $i$-th step ends with the cut-off value $\Lambda_i$, all numbers assigned to
unmatched clones are i.i.d. uniform random numbers between 0 and $\Lambda_i$, as the choice
functions are oblivious. Let $U_i$ be the number of unmatched clones after step $i$. That is,
$U_i = N_\lambda - ik$. Since the $(i+1)$-th choice function tells how to choose the first clone to
form the $(i+1)$-th edge without changing the distribution of the assigned numbers, the
distribution of $\Lambda_{i+1}$ is the distribution of the $(k-1)$-th largest number among
$U_i - 1$ independent uniform random numbers between 0 and $\Lambda_i$. Let $1 - T_j$ be the random
variable representing the largest number among $j$ independent uniform random numbers between 0
and 1. Or equivalently, in the distributional sense, $T_j$ is the random variable representing the
smallest number among the random numbers. Then the largest number among the $U_i - 1$ random
numbers has the same distribution as $\Lambda_i(1 - T_{U_i-1})$. Repeating this $k-1$ times, we have

$$\Lambda_{i+1} = \Lambda_i(1 - T_{U_i-1})(1 - T_{U_i-2}) \cdots (1 - T_{U_i-k+1}),$$

and hence

$$\Lambda_{i+1} = \Lambda_i(1-T_{U_i-1})\cdots(1-T_{U_i-k+1})$$

$$= \Lambda_{i-1}(1-T_{U_{i-1}-1})\cdots(1-T_{U_{i-1}-k+1}) \cdot (1-T_{U_i-1})\cdots(1-T_{U_i-k+1})$$

$$= \lambda \prod_{\substack{j = N_\lambda-(i+1)k+1 \\ j \not\equiv N_\lambda \ (\mathrm{mod}\ k)}}^{N_\lambda-1} (1-T_j).$$

It is crucial to observe that, once $N_\lambda$ is given, all $T_j$ are mutually independent random
variables. This makes the random variable $\Lambda_i$ highly concentrated near its mean, which
enables us to develop theories as if $\Lambda_i$ were a constant. The cut-off value $\Lambda_i$ will
provide enough information to resolve some otherwise difficult problems.
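
The mean of the cut-off value can be computed exactly from this independence. Since $1 - T_j$ is the largest of $j$ uniform numbers, $E[1 - T_j] = j/(j+1)$, so the expected ratio $\Lambda_i/\lambda$ given $N_\lambda = N$ is a product of such factors. The sketch below (function names are assumptions) compares that product with the limiting value $(1 - ik/N)^{(k-1)/k}$.

```python
def expected_cutoff_ratio(N, k, steps):
    """E[Lambda_steps / lambda] given N clones: product of E[1 - T_j] = j/(j+1)
    over the indices j used in the first `steps` steps."""
    ratio = 1.0
    for i in range(steps):
        u = N - i * k                      # unmatched clones before this step
        for j in range(u - k + 1, u):      # j = u-k+1, ..., u-1
            ratio *= j / (j + 1)
    return ratio


def limit_ratio(N, k, steps):
    """Limiting value (1 - steps*k/N)^((k-1)/k) of the cut-off ratio."""
    return (1 - steps * k / N) ** ((k - 1) / k)
```

Within one step the factors telescope to (u-k+1)/u, which is why the product tracks the power law closely once N is large.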

In the next section, we will prove the following slightly more general lemma regarding the
concentration of $\prod (1 - T_j)$.

Lemma 3.2 For a positive integer $k$, let $T_j$'s be mutually independent, $j = N, N-1, \dots, N - lk$
with $N - lk \ge 1$, and let $R$ be a non-empty subset of $\{0, 1, \dots, k-1\}$ with $|R| = r$. Then,
denoting $\theta_i = (1 - ik/N)^{1/k}$, we have, for $\varepsilon \le 0.1$,



Pr 



l+o(l) - 



max 

i:l<i<l 



n (l - T,) - e:\ >e]< lOe-^-'-^''''''^\ 



j=N 



In particular, if 9^ = then 



Pr 



max 

i:l<i<l 



-f7(min{£Ar,57f^}) 



n {^-T)-e: 



j=N 



> e 



< lOe ' ' 
The cut-off line lemma follows from Lemma 3.2. For $\theta$ in the range $0 \le \theta \le 1$, let
$\Lambda(\theta)$ be the cut-off value when $(1 - \theta^{\frac{k}{k-1}})\lambda n$ or more clones
are matched for the first time. Conversely, let $N(\theta)$ be the number of matched clones until
the cut-off line reaches $\theta\lambda$.

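Lemma 3.3 (Cut-off Line Lemma) Let $k \ge 2$ and $\lambda > 0$ be fixed. Then, for $\theta_1 < 1$
uniformly bounded below from 0 and $0 < \Delta \le n$,

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} |\Lambda(\theta) - \theta\lambda| \ge \frac{\Delta}{n}\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})}$$

and

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} |N(\theta) - (1 - \theta^{\frac{k}{k-1}})\lambda n| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})}.$$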
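
The centering term in the second inequality can be observed in a quick simulation. The sketch below conditions on a fixed number N of clones (so it is closer in spirit to a fixed-N version of the lemma), samples each factor $1 - T_j$ as $U^{1/j}$ for a uniform $U$, and counts the clones matched when the ratio $\Lambda/\lambda$ first drops to $\theta$ or below. The function name and parameters are hypothetical.

```python
import random


def matched_until(theta, N, k, rng):
    """Number of clones matched when the cut-off ratio Lambda/lambda first
    drops to theta or below, given N clones in total."""
    ratio, unmatched = 1.0, N
    while unmatched >= k and ratio > theta:
        for j in range(unmatched - 1, unmatched - k, -1):
            ratio *= rng.random() ** (1.0 / j)   # 1 - T_j: largest of j uniforms
        unmatched -= k
    return N - unmatched
```

With k = 2 and theta = 0.5 the predicted count is (1 - 0.5^2) * N = 0.75 * N, and the simulated value should fall within a small multiple of sqrt(N) of it.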



Proof. Suppose $N_\lambda = \lambda n + h$ is given. As $N_\lambda$ is a Poisson random variable with mean $\lambda n$,



Pr 



I A^A — Aral > c min < n 



A 



1-9 



< 2e 



where c is a (small) constant to be specified later. Hence it is enough to consider h in the 
range \h\ < cmin{n, j^}- 



k 

For 9^<9 <1, let be the solution of the equation (1 - 9T^)Xn = (1 - C,e~^)Nx. Then 

^. ^ (i - - f )^" )^ ^ + (1 - ^^J'^ y ^ e + o(il^) 

' V A^>, / V Xn + h J \ Xn + h J 

implies that C,g is uniformly bounded from below by (for small enough c) . Lemma 13.21 gives 

-n(min{A, ,^ , }) 



Pr 



max |A(e)-e,A|> 



<6K1 



A_ 

s'"l — 2n 



Nx = Xn + h 



< 2e 



Taking small enough c, we also have \C,g — 9\ < and 

|A(9)-9A| < |A(9)-{,A|+A|«.-e| 
< |A(«)-?,A| + ^. 

Therefore, if |A(^^) — ^A| > ^ then |A(^^) — ^gX\ > ^ and hence the probability that such 9 
exists in the range < < 1 is at most 

^^-Q(mm{A,^^4i^}) ^ ^^-n(mm{A, }) ^ ^^-nCmini A, ^^}) _ 

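For the second inequality, it is enough to observe that
$|N(\theta) - (1 - \theta^{\frac{k}{k-1}})\lambda n| \ge \Delta$ implies that
$\Lambda(\theta + \Omega(\Delta/n)) \le \theta\lambda$ or $\Lambda(\theta - \Omega(\Delta/n)) \ge \theta\lambda$.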

□ 

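For the Poisson $\lambda$-cell conditioned on $N_\lambda = N$, a similar lemma may be obtained.

Lemma 3.4 (Cut-off Line Lemma for N clones) Let $k \ge 2$ and $\lambda > 0$ be fixed. Then, for the
Poisson $\lambda$-cell conditioned on $N_\lambda = N$, and for $\theta_1 < 1$ uniformly bounded below
from 0 and $0 < \Delta \le N$,

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} |\Lambda(\theta) - \theta\lambda| \ge \frac{\Delta}{N}\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)N}\})}$$

and

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} |N(\theta) - (1 - \theta^{\frac{k}{k-1}})N| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)N}\})}.$$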

4 Large Deviation Inequalities 

In this section, a generalized Chernoff bound and an inequality for random processes are
shown. Let $X_1, \dots, X_m$ be a sequence of random variables such that the distribution
of $X_i$ is determined if all the values of $X_1, \dots, X_{i-1}$ are known. For example,
$X_i = \Lambda(\theta_i)$ with $1 \ge \theta_1 \ge \cdots \ge \theta_m \ge 0$ in a Poisson
$\lambda$-cell. If upper and/or lower bounds are known for the conditional means
$E[X_i \mid X_1, \dots, X_{i-1}]$ and for the conditional second and third moments, then
Chernoff-type large deviation inequalities may be obtained not only for $\sum_{i=1}^m X_i$ but
also for $\min_{1 \le j \le m} \sum_{i=1}^j X_i$ and/or $\max_{1 \le j \le m} \sum_{i=1}^j X_i$.
Large deviation inequalities for such minimums or maximums are especially useful in various
situations. Lemma 3.2 can be shown using such inequalities too.



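Lemma 4.1 Let $X_1, \dots, X_m$ be a sequence of random variables. Suppose

$$E[X_i \mid X_1, \dots, X_{i-1}] \le \mu_i, \qquad (4.1)$$

and there are positive constants $a_i$, $b_i$, and $\xi_0$ so that

$$E[(X_i - \mu_i)^2 \mid X_1, \dots, X_{i-1}] \le a_i, \qquad (4.2)$$

and

$$E[(X_i - \mu_i)^3 e^{\xi(X_i - \mu_i)} \mid X_1, \dots, X_{i-1}] \le b_i \quad \text{for all } 0 \le \xi \le \xi_0. \qquad (4.3)$$

Then for any $\alpha$ with $0 < \alpha \le \xi_0 (\sum_{i=1}^m a_i)^{1/2}$,

$$\Pr\Big[\sum_{i=1}^m X_i \ge \sum_{i=1}^m \mu_i + \alpha\Big(\sum_{i=1}^m a_i\Big)^{1/2}\Big] \le \exp\Big(-\frac{\alpha^2}{2}\Big(1 - \frac{\alpha \sum_{i=1}^m b_i}{3(\sum_{i=1}^m a_i)^{3/2}}\Big)\Big).$$
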

Similarly, 

E[Xi\Xi, Xi^i] > fii 

together with Iji4-^ '^''^d 

E[{X, - /i,)3e«(^'-'^')|Xi, > k for all -^^ < ^ < 

implies that 



(4.4) 
(4.5) 



mm m 

Pr [ ^ Xi < ^ /ii - a a, 

i=l i=l i=l 



< 



exp 



3(Er=i«o^/^ 
Proof. We first show that

$$E\big[e^{\xi \sum_{i=1}^m (X_i - \mu_i)}\big] \le e^{\xi^2 \sum_{i=1}^m a_i/2 + \xi^3 \sum_{i=1}^m b_i/6} \quad \text{for } 0 \le \xi \le \xi_0,$$

using induction. As



it is enough to show 



ce;=i(^.-m.) 



E 
E 



For < ^ ^ 5 Tayfor theorem gives 

^^g?v ^»^|Xi,...,Ai_iJ = H \ 



2^6- 



for some ^* between and ^. 

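Let $\xi = \alpha(\sum_{i=1}^m a_i)^{-1/2}$. Then

$$E\big[e^{\xi \sum_{i=1}^m (X_i - \mu_i)}\big] \le \exp\Big(\frac{\alpha^2}{2} + \frac{\alpha^3 \sum_{i=1}^m b_i}{6(\sum_{i=1}^m a_i)^{3/2}}\Big)$$

and

$$\Pr\Big[\sum_{i=1}^m X_i - \sum_{i=1}^m \mu_i \ge \alpha\Big(\sum_{i=1}^m a_i\Big)^{1/2}\Big] \le E\big[e^{\xi(\sum_{i=1}^m (X_i - \mu_i) - \alpha(\sum_{i=1}^m a_i)^{1/2})}\big] \le \exp\Big(-\frac{\alpha^2}{2} + \frac{\alpha^3 \sum_{i=1}^m b_i}{6(\sum_{i=1}^m a_i)^{3/2}}\Big).$$
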

Similarly, (jM]) together with (g^D, and ^ = together with (gSD gives 



Pr 



m m m i 



< ^[e^(E7^i(^.-^.)+"(E™i-»)^)] <exp 



i=l 



2 6(Er=i«o^/^^' 
□ 



As it is sometimes tedious to point out the value of a and to check the required bounds 
for it, the following forms of inequalities are often more convenient. 

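Corollary 4.2 (Generalized Chernoff bound) If $\delta\xi_0 \sum_{i=1}^m b_i \le \sum_{i=1}^m a_i$
for some $0 < \delta \le 1$, then (4.1)-(4.3) imply

$$\Pr\Big[\sum_{i=1}^m X_i \ge \sum_{i=1}^m \mu_i + R\Big] \le e^{-\frac{1}{3}\min\{\delta\xi_0 R,\ R^2/\sum_{i=1}^m a_i\}}$$

for all $R > 0$. Similarly, if $\delta\xi_0 \sum_{i=1}^m b_i \le \sum_{i=1}^m a_i$ for some
$0 < \delta \le 1$, then (4.2), (4.4) and (4.5) yield

$$\Pr\Big[\sum_{i=1}^m X_i \le \sum_{i=1}^m \mu_i - R\Big] \le e^{-\frac{1}{3}\min\{\delta\xi_0 R,\ R^2/\sum_{i=1}^m a_i\}}$$

for all $R > 0$.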
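
To get a feeling for the strength of such a bound, one can evaluate it for Bernoulli sums and compare it with the exact binomial tail. The sketch below assumes the bound has the form $e^{-\frac13 \min\{\delta\xi_0 R,\ R^2/\sum a_i\}}$ and uses the constants $a_i = p(1-p)$, $b_i = 2p(1-p)$, $\xi_0 = \log 2$ that the paper later takes for Bernoulli variables; the choice $\delta = 1/(2\log 2)$ and the function names are assumptions made for illustration.

```python
import math


def generalized_chernoff_bound(R, sum_a, delta, xi0):
    """Evaluate exp(-(1/3) * min(delta * xi0 * R, R^2 / sum_a))."""
    return math.exp(-min(delta * xi0 * R, R * R / sum_a) / 3.0)


def binomial_upper_tail(m, p, threshold):
    """Exact Pr[Bin(m, p) >= threshold]."""
    start = max(0, math.ceil(threshold))
    return sum(math.comb(m, s) * p ** s * (1 - p) ** (m - s) for s in range(start, m + 1))


def bernoulli_bound(m, p, R):
    """Bound for Pr[sum of m Bernoulli(p) >= m*p + R] with a_i = p(1-p),
    b_i = 2p(1-p), xi0 = log 2 and delta = 1/(2 log 2), so that
    delta * xi0 * sum(b_i) equals sum(a_i)."""
    xi0 = math.log(2)
    return generalized_chernoff_bound(R, m * p * (1 - p), 1 / (2 * xi0), xi0)
```

For m = 100 and p = 0.3 the bound is far from tight, which is expected since it is designed for convenience rather than sharp constants.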
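Proof. For $R \le \delta\xi_0 \sum_{i=1}^m a_i$, Lemma 4.1 with $\alpha = R(\sum_{i=1}^m a_i)^{-1/2}$ gives

$$\Pr\Big[\sum_{i=1}^m X_i \ge \sum_{i=1}^m \mu_i + R\Big] \le \exp\Big(-\frac{R^2}{3\sum_{i=1}^m a_i}\Big).$$

If $R > \delta\xi_0 \sum_{i=1}^m a_i$, one may replace $a_i$ by $a_i^* \ge a_i$ satisfying
$\sum_{i=1}^m a_i^* = R/(\delta\xi_0)$ and obtain

$$\Pr\Big[\sum_{i=1}^m X_i - \sum_{i=1}^m \mu_i \ge R\Big] = \Pr\Big[\sum_{i=1}^m X_i - \sum_{i=1}^m \mu_i \ge \frac{R}{(\sum_{i=1}^m a_i^*)^{1/2}}\Big(\sum_{i=1}^m a_i^*\Big)^{1/2}\Big] \le \exp(-\delta\xi_0 R/3),$$

as

$$\frac{\alpha \sum b_i}{3(\sum a_i^*)^{3/2}} = \frac{R \sum b_i}{3(R/(\delta\xi_0))^{2}} = \frac{(\delta\xi_0)^2 \sum b_i}{3R} \le \frac{\delta\xi_0 \sum b_i}{3 \sum a_i} \le \frac{1}{3}.$$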

□ 

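We now try to obtain inequalities for the maximum and the minimum of a random
process. Let $X_\theta$, $\theta \ge 0$, be random variables, which are possibly set-valued. Here
$\theta$ may be real numbers as well as non-negative integers. For $\theta \ge 0$, suppose
$\Upsilon(\theta)$ is a random variable depending on $\{X_{\theta'}\}_{\theta' \le \theta}$ and
$\theta$, and

$$\psi = \psi(\{X_{\theta'}\}_{\theta' \le \theta_1}; \theta_0, \theta_1), \quad \text{and} \quad \psi_\theta = \psi_\theta(\{X_{\theta'}\}_{\theta' \le \theta_1}; \theta_0, \theta, \theta_1).$$

The random variables $\psi$ and $\psi_\theta$ are to be used to bound $\Upsilon(\theta)$.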

Example 4.3 Let $X_1, X_2, \dots$ be i.i.d. Bernoulli random variables with mean $p$ and
$S_i = \sum_{j=1}^i X_j$. Set $\Upsilon(i) = |S_i - ip|$,

$$\psi = \Upsilon(n) \quad \text{and} \quad \psi_i = |S_n - S_i - (n-i)p|.$$

Then, since

$$S_i - ip = S_n - np - (S_n - S_i - (n-i)p),$$

we have

$$\Upsilon(i) \le \psi + \psi_i.$$
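
The decomposition in this example is just the triangle inequality, and it holds for every realization of the sequence. A small check (the function name and the floating-point slack 1e-9 are assumptions) makes this concrete:

```python
import random


def check_decomposition(xs, p):
    """Verify |S_i - i*p| <= |S_n - n*p| + |S_n - S_i - (n-i)*p| for all i."""
    n = len(xs)
    prefix = [0]
    for x in xs:
        prefix.append(prefix[-1] + x)

    def upsilon(i):
        return abs(prefix[i] - i * p)

    psi = upsilon(n)
    return all(
        upsilon(i) <= psi + abs(prefix[n] - prefix[i] - (n - i) * p) + 1e-9
        for i in range(n + 1)
    )
```

The point of the decomposition is that the second term depends only on the variables after time i, so it is independent of the history up to time i.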



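Example 4.4 Consider the $\lambda$-cell on $n$ vertices defined in the previous section. Let
$v_\theta$ be the vertex that has its largest clone at $(1-\theta)\lambda$. If such a vertex does
not exist, $v_\theta$ is defined to be $\aleph$, assuming $\aleph \notin V$. As there is no
possibility that two distinct clones are assigned the same number, $v_\theta$ is well-defined. Let
$X_\theta = v_\theta$ and $V(\theta)$ be the set of vertices that contain no clone larger than or
equal to $(1-\theta)\lambda$. That is, $V(\theta) = V \setminus \{v_{\theta'} : 0 \le \theta' \le \theta\}$.
Clearly, $E[|V(\theta)|] = e^{-\theta\lambda} n$. Observing that, for $\theta_0 \le \theta \le \theta_1$,

$$\big| |V(\theta)| - e^{-\theta\lambda} n \big| \le e^{(\theta_1-\theta_0)\lambda}\Big( \big| |V(\theta_1)| - e^{-\theta_1\lambda} n \big| + \big| |V(\theta_1)| - e^{-(\theta_1-\theta)\lambda} |V(\theta)| \big| \Big),$$

we may set $\Upsilon(\theta) = \big| |V(\theta)| - e^{-\theta\lambda} n \big|$,

$$\psi = e^{(\theta_1-\theta_0)\lambda}\, \Upsilon(\theta_1), \quad \text{and} \quad \psi_\theta = e^{(\theta_1-\theta_0)\lambda} \big| |V(\theta_1)| - e^{-(\theta_1-\theta)\lambda} |V(\theta)| \big|.$$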
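
The expectation $E[|V(\theta)|] = e^{-\theta\lambda} n$ can be illustrated by simulation. The sketch below uses a shortcut that is an assumption of this illustration: for each vertex, the distance from $\lambda$ down to its largest clone is taken to be an independent exponential variable with mean 1, so the vertex belongs to $V(\theta)$ exactly when that distance exceeds $\theta\lambda$ (a distance larger than $\lambda$ corresponds to a vertex with no clone at all).

```python
import math
import random


def survivors(n, lam, thetas, rng):
    """Return |V(theta)| for each theta, where V(theta) is the set of vertices
    with no clone larger than or equal to (1 - theta) * lam."""
    gaps = [rng.expovariate(1.0) for _ in range(n)]
    return [sum(g > theta * lam for g in gaps) for theta in thetas]
```

Because all values of theta are evaluated on the same sample, the counts are non-increasing in theta, as the set description requires.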
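We bound the probabilities of $\max_{\theta_0 \le \theta \le \theta_1} \Upsilon(\theta) \ge R$ and
$\min_{\theta_0 \le \theta \le \theta_1} \Upsilon(\theta) \le -R$ under some conditions.

Lemma 4.5 Let $\theta_0 < \theta_1$, $R = R_1 + R_2$, $R_1, R_2 > 0$ and $\Phi_\theta$ be events
depending on $\{X_{\theta'}\}_{\theta' \le \theta}$. If

$$\Upsilon(\theta) \le \psi + \psi_\theta, \quad \forall\, \theta_0 \le \theta \le \theta_1,$$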



then 



Ft 



max T{9)>R 



< Ft 



ij>Ri 



Pr [ U $ 6 



9;6»,)<6l<6i^ 



+ max max l{^g)FT llpg > R2 {Xg>}g/<g 



Similarly, if 
then 

Pr 



T{9)>i, + ije, y9,<9<9„ 



min r(^) < -R 



< Ft 



ip < -Ri 



+ Ft\ U % 



+ max max l($6))Pr \il)g < -R2 {Xg'}e><e 



Example 4.3 (continued) As

$$\Pr[\psi \ge R_1] \le e^{-\Omega(\min\{R_1, \frac{R_1^2}{p(1-p)n}\})}$$

and

$$\Pr[\psi_i \ge R_2 \mid X_1, \dots, X_i] = \Pr[\psi_i \ge R_2] \le e^{-\Omega(\min\{R_2, \frac{R_2^2}{p(1-p)(n-i)}\})},$$
Lemma \4^ for Ri = R2 = R/2 and $e = gives 

$$\Pr\Big[\max_{i: 0 \le i \le n} |S_i - pi| \ge R\Big] \le e^{-\Omega(\min\{R, \frac{R^2}{p(1-p)n}\})}.$$



Example 4.4 (continued) Since

$$|V(\theta)| = \sum_{v \in V} 1(v \text{ has no } (1-\theta)\lambda\text{-large clone})$$

is a sum of i.i.d. Bernoulli random variables with mean $e^{-\theta\lambda}$,



Pr 



especially 



\V{9)\-e-'^n 
Pr V > 



> R 



< 2e 



-n(min{i?,f-}) 



24 



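Once $\{X_{\theta'}\}_{\theta' \le \theta}$ is given, $V(\theta)$ is determined and

$$|V(\theta_1)| = \sum_{v \in V(\theta)} 1(v \text{ has no } (1-\theta_1)\lambda\text{-large clone})$$

is a sum of i.i.d. Bernoulli random variables with mean $e^{-(\theta_1-\theta)\lambda}$. Thus,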



Ft 



> R/2 {Xe'}e'<e 



< 2e 



and Lemma \4.5\ for 6^^ = and $0 = yields 



Ft 



max 

6»:0<e<fl 



> R 



< 2e 



The proof of Lemma 4.5 follows.

Proof of Lemma 4.5. Let $\tau$ be the first $\theta$ in the range $\theta_0 \le \theta \le \theta_1$
with $\Upsilon(\theta) \ge R$. If no such $\theta$ exists, $\tau = \infty$ and $\psi_\tau = -\infty$.
Observe that



Pr 



r < 00 



< Pr 

< Pr 



T<oo, ^<Ri, Pi <l>ej + Pr[^/^ > + Pr ^ |J $6 
A > R2, n ^e] + Pr[^ > + Pr [ |J . 



Considering the conditional probability on $\{X_\theta\}_{\theta \le \tau}$, we have



Pr 



i^r > R2, n $6 



= E 

< E 



Ft 



i^r>R2, n ^e\{Xe}e<r 



l($,)Pr iJr>R2\{Xe]e<r 



As 



l($^)Pr 



{Xq}q^j. < max max !($£)) Pr 



i^9>R2 



{Xe'}e 



the desired inequality follows. 

Applying the same argument for —T{9), the second part also follows. 



□ 



Proof of Lemma 3.2. As

$$\prod_j (1 - T_j) = \exp\Big(\sum_j \log(1 - T_j)\Big),$$

we show a high concentration for $\sum_j \log(1 - T_j)$. Since
$\Pr[\exists j,\ T_j \ge 1/2] \le \sum_{j \ge \theta_l^k N} 2^{-j} \le 2^{-\theta_l^k N + 1}$ and

$$-x - x^2 \le \log(1-x) \le -x \quad \forall x : 0 \le x \le 1/2,$$

with probability at least $1 - 2^{-\theta_l^k N + 1}$, we have

$$-T_j - T_j^2 \le \log(1 - T_j) \le -T_j, \quad \text{for all } j.$$
Thus, it is enough to show that both of E[S*] and -^[T^*] are very close to and 

T* := and S* := 



j=N 



j=N 



are highly concentrated. That is, we will show that 



l+o(l) . ,e^ke^N , 

Pr[max|5; - E[S*]\ > e] < « '"^"^ 



and 



together with 



Pr[max|7;* - E[T*]\ > e] < Ae -^'^'''^W^'''i^\ 



E[Sn--rloge^ + o{{^y'), and E[T:] ^ -r log 6, + o{{^y') 



1-^.\1/2n 



The o^^^F^j j terms do not matter, since the desired inequality is trivial unless e — 
If £ = fl{{^gi^y^^), then the above concentration inequalities for 0.95s give 



Pr 



max ^ log(l — Tj) — r log ^. 



j=N 

N-iekR 



> 0.95£ 



which along with £ < 0.1 yields 

0''N 



Pr 



max I TT fl - - 6' 

i:l<i<l\ J-l V V 



N-je^R 



> e 



l+o(l) . rS^kOfN ^ 
< gg-^V^ ™"^{ 2(1-9;) '^^f-^} _^ 2~^f^+^ 



< lOe ^2(1-6,)'^'^;^^/ 



For the concentration inequalities without the maximum, it is enough to check the hy- 
potheses of the generalized Chernoff bound. First, it is routine to check that 

FIT''] - 
for positive integer $h$. Thus, for $j \ge 1$,



e':N 



Var[Sj\ < E[S'j] < — =: a,, and ^ 



3 3r(l-^f) ^Ak{l-e^) _ ^(l-d 



I 



Furthermore, for < ,^ < ^ '■= O.W^N, 



(4.6) 



E[S'e' 



'3MSi-E[S,])i 



10 



and, for — ^ C < 0, 



Since 



10 

f 



9''N 

^ 10 ^ 6r(l-^f*^) 



(4.7) 



we have that — Yl]=i^i hence, for e > 0, the generahzed Chernoff bound 

(Corollary 14. 2 p gives 



Pr 



and 



As 



Pr 



\s:~E[s:]\>s 
\s* - s: - E[s: ~ sn\ > e 



< 2e" 



.i±£(i)^in{4^,e9fiV} 



1^; - E[s:]\ < \s; - E[s;]\ + - s: - e[s: - sni 

and S* — SI is independent of Si,...,Si, we may apply Lemma 1431 with £i = 52 = £^/2 to 



obtain 



Similarly, 



FT[max\S*-E[S*]\>e]<4e 

i:l<i<l 



Frlrrmx IT* ~ E[T:]\ > s] < Ae 

t:l<i<l 



For the expectations, since 









E 






and 





l + 0(l/j) ^ r 
J + 1 - /c 



^ /I- (9^ 



-r log 9i+o 



1 - 0hi/2 



em 



e'=7v 6»'=Af e^Af 



■ 1 _ 61K 1/2 



N-j&kR 



j = N 
N-jSkR 



9fN 
we have



E[Sn = -r\oge, + o[[-^) ), and E[T;] = -r\oge, + o[[-^) ) 
as desired. 

□ 

5 Generalized Core-Processes and Main Lemma 

In this section, we introduce generalized cores and the main lemma. The main lemma will
be crucial in the proofs of the theorems mentioned in the introduction.

We start with some terminology. A generalized degree is an ordered pair $(d_1, d_2)$ of
non-negative integers. The inequality between two generalized degrees is determined by
the inequality between the first coordinates and the reverse inequality between the second
coordinates. That is, $(d_1, d_2) \ge (d_1', d_2')$ if and only if $d_1 \ge d_1'$ and $d_2 \le d_2'$.
A property for generalized degrees is simply a set of generalized degrees. A property $P$ is
increasing if generalized degrees larger than an element in $P$ are also in $P$. When a property
$P$ depends only on the first coordinate of generalized degrees, it is a property for degrees. For
the t-core problem, we will use $P_{t\text{-core}} = \{(d_1, d_2) : d_1 \ge t\}$. To estimate the size
of the largest component, we will set $P_{comp} = \{(d_1, d_2) : d_2 = 0\}$.
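
The ordering and the two properties can be written down directly. The sketch below is illustrative only: properties are modelled as predicates on pairs, and `is_increasing` checks the definition by brute force on a small grid (the grid bound is an arbitrary assumption, so a True result is evidence on that grid rather than a proof).

```python
def dominates(d, e):
    """(d1, d2) >= (e1, e2) iff d1 >= e1 and d2 <= e2."""
    return d[0] >= e[0] and d[1] <= e[1]


def t_core_property(t):
    """P_t-core = {(d1, d2) : d1 >= t}, as a predicate."""
    return lambda d: d[0] >= t


def comp_property(d):
    """P_comp = {(d1, d2) : d2 = 0}, as a predicate."""
    return d[1] == 0


def is_increasing(prop, bound=6):
    """Check on the grid {0..bound-1}^2 that every generalized degree
    dominating an element of the property is also in the property."""
    degs = [(a, b) for a in range(bound) for b in range(bound)]
    return all(prop(d) for e in degs if prop(e) for d in degs if dominates(d, e))
```

Both properties named in the text pass this check, while a property such as "d1 equals 2" does not.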

Given the Poisson $\lambda$-cell on the set $V$ of $n$ vertices and $\theta$ in the range
$0 \le \theta \le 1$, let $d_v(\theta)$ be the number of $v$-clones smaller than $\theta\lambda$.
Similarly, $\bar{d}_v(\theta)$ is the number of $v$-clones larger than or equal to $\theta\lambda$.
Then, $D_v(\theta) := (d_v(\theta), \bar{d}_v(\theta))$ are i.i.d. random variables. In
particular, for any property $P$, the events $D_v(\theta) \in P$ are independent and occur with the
same probability, say $p(\theta, \lambda; P)$, or simply $p(\theta)$.

For an increasing property $P$, the $P$-process is defined as follows. Construct the Poisson
$\lambda$-cell as described in Section 3, where $\lambda = p\binom{n-1}{k-1}$. The vertex set
$V = \{v_0, \dots, v_{n-1}\}$ will be regarded as an ordered set so that the $i$-th vertex is
$v_{i-1}$. The $P$-process, or generalized core-process generated by $P$, is a generalization of
Example 2.2 for which choice functions choose $t$-light clones.

The P-process: Initially, the cut-off value $\Lambda = \lambda$. Activate all vertices $v$ with
$D_v(1) \notin P$. All clones of the activated vertices are activated too. Put activated clones in
a stack in an arbitrary order. However, this does not mean that the clones are removed from the
$\lambda$-cell.

(a) If the stack is empty, go to (b). If the stack is nonempty, choose the first clone in the
stack and move the cut-off line to the left until the largest $k-1$ unmatched clones, excluding
the chosen clone, are found. (So, the cut-off value $\Lambda$ keeps decreasing.) Then, match the
$k-1$ clones to the chosen clone. Remove all matched clones from the stack and repeat. A
vertex $v$ that has not been activated is to be activated as soon as
$D_v(\Lambda/\lambda) \notin P$. This can be done even before all $k-1$ clones are found. Its
unmatched clones are to be activated too and put into the stack immediately. Clones found while
moving the cut-off line are also in the stack until they are matched.

(b) Activate the first vertex in $V$ that has not been activated. Its clones are activated too.
Put those clones into the stack. Then, go to (a).

Clones in the stack are called active. The steps carried out by the instruction described in (b)
are called free steps, as we are free to choose any clone.

When the cut-off line is at $\theta\lambda$, all $\theta\lambda$-large clones are matched or will be
matched at the end of the step and all vertices $v$ with $D_v(\theta) \notin P$ have been
activated. All other vertices can have been activated only by free steps. Let
$V(\theta) = V_P(\theta)$ be the set of vertices $v$ with $D_v(\theta) \in P$, and let
$M(\theta) = M_P(\theta)$ be the number of $\theta\lambda$-large clones plus the number of
$\theta\lambda$-small clones of vertices $v$ not in $V(\theta)$. That is,

$$M(\theta) = \sum_{v \in V} \bar{d}_v(\theta) + d_v(\theta) 1(v \notin V(\theta)) = \sum_{v \in V} \bar{d}_v(\theta) + d_v(\theta) 1(D_v(\theta) \notin P). \qquad (5.1)$$

Recalling that $N(\theta)$ is the number of matched clones until the cut-off line reaches
$\theta\lambda$, the number $A(\theta)$ of active clones (when the cut-off value $\Lambda$ is) at
$\theta\lambda$ is at least as large as $M(\theta) - N(\theta)$. On the other hand, the difference
$A(\theta) - (M(\theta) - N(\theta))$ is at most the number $F(\theta)$ of clones activated in free
steps until $\theta\lambda$, i.e.,

$$M(\theta) - N(\theta) \le A(\theta) \le M(\theta) - N(\theta) + F(\theta).$$

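As the cut-off line lemma gives a concentration inequality for $N(\theta)$,

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} |N(\theta) - (1 - \theta^{\frac{k}{k-1}})\lambda n| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})}, \qquad (5.2)$$

a concentration inequality for $M(\theta)$ will be enough to obtain a similar inequality for
$B(\theta) := M(\theta) - N(\theta)$. More precisely, we will show that, under appropriate hypotheses,

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} \big|M(\theta) - (\lambda - q(\theta))n\big| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})},$$

where

$$q(\theta) = q(\theta, \lambda; P) := E\big[d_v(\theta) 1(D_v(\theta) \in P)\big].$$

As the $d_v(\theta)$'s and $D_v(\theta)$'s are identically distributed, $q(\theta)$ does not depend
on $v$. Recall also $p(\theta) = \Pr[D_v(\theta) \in P]$.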
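
For a concrete property both quantities have closed forms. Taking the t-core property $\{(d_1,d_2) : d_1 \ge t\}$ as an example, and using that the number of clones smaller than $\theta\lambda$ is Poisson with mean $\theta\lambda$, one gets $p(\theta) = \Pr[\mathrm{Poi}(\theta\lambda) \ge t]$ and, by the standard identity $E[X 1(X \ge t)] = \mu \Pr[X \ge t-1]$ for a Poisson variable $X$ with mean $\mu$, $q(\theta) = \theta\lambda \Pr[\mathrm{Poi}(\theta\lambda) \ge t-1]$. The sketch below (function names are assumptions) evaluates both and cross-checks $q$ against a direct summation.

```python
import math


def poisson_tail(mean, t):
    """Pr[Poisson(mean) >= t]."""
    return 1.0 - sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(max(t, 0)))


def p_t_core(theta, lam, t):
    """p(theta) for the t-core property."""
    return poisson_tail(theta * lam, t)


def q_t_core(theta, lam, t):
    """q(theta) = E[d 1(d >= t)] for d ~ Poisson(theta * lam)."""
    return theta * lam * poisson_tail(theta * lam, t - 1)


def q_by_summation(theta, lam, t, terms=120):
    """Direct (truncated) evaluation of sum_{i >= t} i * Pr[d = i]."""
    mu = theta * lam
    return sum(i * math.exp(-mu) * mu ** i / math.factorial(i) for i in range(t, terms))
```

The truncation at 120 terms is harmless for the moderate means used here; for much larger means the direct summation would need more terms.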

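As we will see later, $B(\theta)$ is very close to $A(\theta)$. Hence, a concentration inequality
for $B(\theta)$ is crucial.

Lemma 5.1 (Main lemma) In the $P$-process, if $\theta_1 < 1$ is uniformly bounded from below by 0,
$1 - p(\theta_1) = O(1 - \theta_1)$ and $p(\theta_1) = \Omega(1)$, then, for all $\Delta$ in the
range $0 < \Delta \le n$,

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} \big| |V(\theta)| - p(\theta)n \big| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})},$$

and

$$\Pr\Big[\max_{\theta:\theta_1 \le \theta \le 1} \big| B(\theta) - (\lambda\theta^{\frac{k}{k-1}} - q(\theta))n \big| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{(1-\theta_1)n}\})}.$$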
Remark. If $\Delta \ge n^{1/2} \log n$, the proof of the main lemma is much easier: Without the max,
the two concentration inequalities follow from the generalized Chernoff bound. Since the
bounds for the probabilities are much less than $1/n$ and there are $O(n)$ meaningful $\theta$'s,
the first moment method gives the inequalities. This is already enough to prove Theorems 1.4,
1.7 and 1.9 provided $|\lambda - \lambda_{crt}| \ge n^{-1/2} \log n$. The full strength of the
lemma is needed when the $\log n$ factor is missing.

For the proof, we first show a concentration inequality for

$$|V(\theta)| = \sum_{v \in V} 1(D_v(\theta) \in P),$$

which is a sum of i.i.d. Bernoulli random variables with mean $p(\theta)$. More generally, we have

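Lemma 5.2 Suppose $X_i$'s are i.i.d. Bernoulli random variables with mean $p$. Then, for
$\Delta > 0$,

$$\Pr\Big[\Big|\sum_{i=1}^m X_i - pm\Big| \ge \Delta\Big] \le 2e^{-\Omega(\min\{\Delta, \frac{\Delta^2}{p(1-p)m}\})}.$$

Proof. Since

$$E[X_i] = p, \quad E[(X_i - p)^2] = p(1-p),$$

and, for $\xi$ with $|\xi| \le \xi_0 := \log 2$,

$$\big|E[(X_i - p)^3 e^{\xi(X_i - p)}]\big| \le 2p(1-p),$$

we may set $a_i = p(1-p)$ and $b_i = 2p(1-p)$. Applying the generalized Chernoff bound, we
have the desired inequality.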

□ 

Corollary 5.3 For $\theta$ in the range $\theta_1 \le \theta \le 1$ and with the same hypotheses as
in the main lemma,



Pr 



> A 



\Vp{9)\-p{9)n 

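As in Example 4.4, Lemma 4.5 yields a concentration inequality for all of the $V(\theta)$'s:

Lemma 5.4 With the same hypotheses as in the main lemma,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big||V(\theta)|-p(\theta)n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$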

Proof. Observing that, for 9-^ < 9 < 1, 

\V{9)\-p{9)n\ < \\V{9,)\-p{9-,)n\ + \\V{9,)\ - f^l^l 
we set $\Gamma(\theta)=\big||V(\theta)|-p(\theta)n\big|$,

Clearly, $\Gamma(\theta)\le\Phi+\Psi_\theta$. Corollary 5.3 gives

$$\Pr[\Phi\ge\Delta/2]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}. \tag{5.3}$$



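Suppose $\{X_{\theta'}:=V(\theta')\}_{\theta\le\theta'\le 1}$ is given; in particular, $V(\theta)$ is given. Then, since $P$ is increasing, we may write $|V(\theta_1)|$ as

$$|V(\theta_1)|=\sum_{v\in V(\theta)}1(D_v(\theta_1)\in P),$$

with

$$\Pr\big[D_v(\theta_1)\in P\,\big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\big]=\Pr\big[D_v(\theta_1)\in P\,\big|\,v\in V(\theta)\big].$$

Lemma 5.2 then gives

(5.4)

Lemma 4.5 together with (5.3) and (5.4) yields the desired inequality.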



□ 



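We now estimate $M(\theta)$. First, since $\sum_{v\in V}\bar d_v(\theta)$ is a Poisson random variable with mean $(1-\theta)\lambda n$,

$$\Pr\Big[\Big|\sum_{v\in V}\bar d_v(\theta)-(1-\theta)\lambda n\Big|\ge\Delta/2\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta)n)\})}. \tag{5.5}$$

For the second sum in (5.1), observe that

$$\sum_{v\in V}d_v(\theta)\,1(D_v(\theta)\notin P)$$

is a sum of i.i.d random variables with

$$E\big[d_v(\theta)1(D_v(\theta)\notin P)\big]=E\big[d_v(\theta)-d_v(\theta)1(D_v(\theta)\in P)\big]=\theta\lambda-q(\theta).$$

Moreover, since $P$ is an increasing property and $d_v(\theta)$ is a Poisson $\theta\lambda$ random variable, the FKG inequality (see e.g. Chapter 6 of [6]) gives

$$E\big[\big(d_v(\theta)1(D_v(\theta)\notin P)\big)^i\big]\le E\big[d_v(\theta)^i\big]\Pr[D_v(\theta)\notin P]=O(1-p(\theta)),$$

for all fixed $i$, e.g. $i=1,2,3$. Thus one may take $\xi_0=1$ and $a_i,b_i=O(1-p(\theta))$ to satisfy all the conditions to apply the generalized Chernoff bound and to obtain

$$\Pr\Big[\Big|\sum_{v\in V}d_v(\theta)1(v\notin V(\theta))-(\theta\lambda-q(\theta))n\Big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-p(\theta))n)\})}.$$

This together with (5.5) implies that if $1-p(\theta)=O(1-\theta)$, then

$$\Pr\Big[\big|M(\theta)-(\lambda-q(\theta))n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta)n)\})}. \tag{5.6}$$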

As mentioned, the following lemma is enough to prove the main lemma.

Lemma 5.5 With the same hypotheses as in the main lemma,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|M(\theta)-(\lambda-q(\theta))n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$

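Proof. Clearly,

$$\big|M(\theta)-(\lambda-q(\theta))n\big|\le\big|M(\theta_1)-(\lambda-q(\theta_1))n\big|+\big|M(\theta_1)-M(\theta)-(q(\theta)-q(\theta_1))n\big|.$$

Let $\Gamma(\theta)=\big|M(\theta)-(\lambda-q(\theta))n\big|$,

$$\Phi=\Gamma(\theta_1),\qquad \Psi_\theta=\big|M(\theta_1)-M(\theta)-(q(\theta)-q(\theta_1))n\big|$$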
and $e is the event \V{6) — p{6)n\ < ■• Then, f l5.6p gives 

$$\Pr[\Phi\ge\Delta/2]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$
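For $\Psi_\theta$, suppose $\{X_{\theta'}:=(V(\theta'),M(\theta'))\}_{\theta\le\theta'\le 1}$ is given. Using

$$M(\theta)=\sum_{v\in V}\bar d_v(\theta)+d_v(\theta)1(v\notin V(\theta))=\sum_{v\in V}d_v(1)-d_v(\theta)1(v\in V(\theta)),$$

we obtain

$$M(\theta_1)-M(\theta)=\sum_{v\in V}d_v(\theta)1(v\in V(\theta))-d_v(\theta_1)1(v\in V(\theta_1)).$$

Once $V(\theta)$ is given, the distributions of

$$Y_v:=d_v(\theta)1(v\in V(\theta))-d_v(\theta_1)1(v\in V(\theta_1))$$

depend on neither $\{M(\theta')\}_{\theta\le\theta'\le 1}$ nor $\{V(\theta')\}_{\theta<\theta'\le 1}$ and hence, for $v\in V(\theta)$,

$$E\big[Y_v\,\big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\big]=E\big[d_v(\theta)-d_v(\theta_1)1(v\in V(\theta_1))\,\big|\,v\in V(\theta)\big]=\frac{q(\theta)-q(\theta_1)}{p(\theta)}.$$

If $v\notin V(\theta)$, $Y_v=0$ since $P$ is increasing.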
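Also, for $v\in V(\theta)$, we may write

$$Y_v=d_v(\theta)-d_v(\theta_1)+d_v(\theta_1)1(v\notin V(\theta_1))$$

and

$$E\big[Y_v^j\,\big|\,v\in V(\theta)\big]\le 2^jE\big[(d_v(\theta)-d_v(\theta_1))^j\,\big|\,v\in V(\theta)\big]+2^jE\big[d_v(\theta_1)^j1(v\notin V(\theta_1))\,\big|\,v\in V(\theta)\big].$$

First, for $j=1,2,3$,

$$E\big[(d_v(\theta)-d_v(\theta_1))^j\,\big|\,v\in V(\theta)\big]\le p(\theta)^{-1}E\big[(d_v(\theta)-d_v(\theta_1))^j\big]=O(\theta-\theta_1)=O(1-\theta_1),$$

for $p(\theta)\ge p(\theta_1)=\Omega(1)$ and $d_v(\theta)-d_v(\theta_1)$ is a Poisson random variable with mean $O(\theta-\theta_1)$. For the second term, the FKG inequality gives

$$E\big[\big(d_v(\theta_1)1(v\notin V(\theta_1))\big)^j\,\big|\,v\in V(\theta)\big]\le p(\theta)^{-1}E\big[d_v(\theta_1)^j1(v\notin V(\theta_1))\big]\le p(\theta)^{-1}E\big[d_v(\theta_1)^j\big]\Pr[v\notin V(\theta_1)]=O(1-p(\theta_1))=O(1-\theta_1),$$

for $j=1,2,3$. Therefore,

$$E\big[(Y_v-E[Y_v])^2\,\big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\big]\le E\big[Y_v^2\,\big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\big]=O(1-\theta_1).$$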



Similarly, for $\xi$ in the range $|\xi|\le\xi_0=1$, it is not hard to show

$$E\big[(Y_v-E[Y_v])^3e^{\xi(Y_v-E[Y_v])}\,\big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\big]=O(1-\theta_1).$$



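Applying the generalized Chernoff bound, we have

$$\Pr\Big[\Big|\sum_{v\in V(\theta)}Y_v-\frac{q(\theta)-q(\theta_1)}{p(\theta)}|V(\theta)|\Big|\ge\Delta/4\,\Big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$

Finally, as the event $\Phi_\theta$ guarantees

$$\frac{q(\theta)-q(\theta_1)}{p(\theta)}\,\big||V(\theta)|-p(\theta)n\big|\le\Delta/4,$$

since $p(\theta)\ge p(\theta_1)=\Omega(1)$ and $q(\theta)\le\lambda$, we have

$$1(\Phi_\theta)\Pr\Big[\Big|\sum_{v\in V(\theta)}Y_v-(q(\theta)-q(\theta_1))n\Big|\ge\Delta/2\,\Big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\Big]\le\Pr\Big[\Big|\sum_{v\in V(\theta)}Y_v-\frac{q(\theta)-q(\theta_1)}{p(\theta)}|V(\theta)|\Big|\ge\Delta/4\,\Big|\,\{X_{\theta'}\}_{\theta\le\theta'\le 1}\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$

Lemma 4.5 yields the desired inequality.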



□ 



6 Cores of Random Hypergraphs 

This section is for the proof of Theorem 1.7. Let $\lambda>0$ and $H(\lambda)=H_{PC}(n,p;k)$, where $\lambda=p\binom{n-1}{k-1}$. Let the property $P=\{(d_1,d_2):d_1\ge t\}$. Then

$$p(\theta)=Q(\theta\lambda,t),\quad\text{and}\quad q(\theta)=\theta\lambda\,Q(\theta\lambda,t-1).$$
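The formula for $q(\theta)$ follows from the identity $E[d\,1(d\ge t)]=\rho\,Q(\rho,t-1)$ for a Poisson($\rho$) variable $d$. The sketch below checks it numerically. It assumes, consistently with the way $P$ and $Q$ are used later in this section, that $P(\rho,i)=\Pr[\mathrm{Poi}(\rho)=i]$ and $Q(\rho,t)=\Pr[\mathrm{Poi}(\rho)\ge t]$; the function names and the truncation point `cutoff` are illustrative choices, not taken from the text.

```python
import math


def poisson_pmf(rho: float, i: int) -> float:
    """P(rho, i) = Pr[Poisson(rho) = i], computed in log space."""
    if rho == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-rho + i * math.log(rho) - math.lgamma(i + 1))


def Q(rho: float, t: int) -> float:
    """Q(rho, t) = Pr[Poisson(rho) >= t]."""
    return 1.0 - sum(poisson_pmf(rho, i) for i in range(t))


def q_closed(theta: float, lam: float, t: int) -> float:
    """q(theta) = theta * lam * Q(theta * lam, t - 1), as in the text."""
    return theta * lam * Q(theta * lam, t - 1)


def q_direct(theta: float, lam: float, t: int, cutoff: int = 100) -> float:
    """E[d * 1(d >= t)] for d ~ Poisson(theta * lam), by truncated summation."""
    rho = theta * lam
    return sum(i * poisson_pmf(rho, i) for i in range(t, cutoff))
```

For moderate values of $\theta\lambda$ the two functions agree to within floating-point error, since the truncated tail is negligible.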



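The main lemma gives

Corollary 6.1 For $\theta_1<1$ uniformly bounded from below by 0 and $\Delta$ in the range $0<\Delta<n$,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big||V(\theta)|-Q(\theta\lambda,t)n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

and

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-(\theta^{1/(k-1)}-Q(\theta\lambda,t-1))\theta\lambda n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$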



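Subcritical Region: For $\lambda=\lambda_{\mathrm{crt}}-\sigma$, $\sigma\gg n^{-1/2}$ and $\theta_1=\delta/\lambda_{\mathrm{crt}}$ with $\delta=0.1$, it is easy to see that there is a constant $c>0$ such that

$$(\theta^{1/(k-1)}-Q(\theta\lambda,t-1))\theta\lambda n\ge c\sigma n,\quad\text{for all }\theta\text{ in the range }\theta_1\le\theta\le 1.$$

Let $\tau$ be the first time the number $A(\theta)$ of active clones at $\theta\lambda$ becomes 0. Then the second part of Corollary 6.1 gives

$$\Pr[\tau\ge\theta_1]\le\Pr[B(\theta)=0\text{ for some }\theta\text{ with }\theta_1\le\theta\le 1]\le\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-(\theta^{1/(k-1)}-Q(\theta\lambda,t-1))\theta\lambda n\big|\ge c\sigma n\Big].$$

As $\theta_1\lambda<\theta_1\lambda_{\mathrm{crt}}=\delta$, and hence $Q(\theta_1\lambda,t)\le\delta/2$ for $t\ge 2$, the first part of Corollary 6.1 yields

$$\Pr[|V_t(H_{PC}(n,p;k))|\ge\delta n]\le\Pr[\tau\ge\theta_1]+\Pr[|V(\theta_1)|\ge\delta n]\le 2e^{-\Omega(\sigma^2n)}.$$

Therefore, Theorem 1.1 implies that

$$\Pr[|V_t(H(n,p;k))|\ge\delta n]\le 2e^{-\Omega(\sigma^2n)}.$$

To complete the proof, we observe that the $t$-core of size $i$ has at least $ti/k$ edges. Let $Z_i$ be the number of subgraphs on $i$ vertices with at least $ti/k$ edges, $i=i_0,\ldots,\delta n$, where $i_0=i_0(k,t)$ is the least $i$ such that $\binom{i}{k}\ge ti/k$. Then, in $H(n,p;k)$,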



i\ {ti/k)\ 

where $ti/k$ actually means $\lceil ti/k\rceil$. Observe that

,k -kt ^ ' ' i \ {k~l)t-k 

-nJ 



That is, $L_{i+k}/L_i$ decreases exponentially. Since

$$L_i=O\big(n^{-i(t-1-t/k)}\big)$$

for $i=i_0,\ldots,i_0+k-1$, it follows that

$$\Pr[V_t(H(n,p;k))\ne\emptyset]\le 2e^{-\Omega(\sigma^2n)}+O\big(n^{-i_0(t-1-t/k)}\big),$$

as desired. 



□ 
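The threshold $i_0(k,t)$ used in the counting argument above, the least $i$ with $\binom{i}{k}\ge ti/k$, is easy to compute. The sketch below is illustrative only (the function name is an assumption); it uses the integer comparison $k\binom{i}{k}\ge ti$, which is equivalent to $\binom{i}{k}\ge\lceil ti/k\rceil$ because the binomial coefficient is an integer.

```python
from math import comb


def smallest_core_size(k: int, t: int) -> int:
    """Least i such that C(i, k) >= t * i / k (hypothetical helper for i_0(k, t))."""
    i = 1
    # C(i, k) is 0 for i < k, so the loop always passes through i = k.
    while k * comb(i, k) < t * i:
        i += 1
    return i
```

For graphs ($k=2$) this recovers familiar small cases: the smallest possible 2-core is a triangle and the smallest possible 3-core is $K_4$.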



Supercritical Region: We will prove the following theorem.

Theorem 6.2 If $\lambda:=p\binom{n-1}{k-1}=\lambda_{\mathrm{crt}}+\sigma$ with $\sigma\gg n^{-1/2}$ and $0<\delta\le 1$, then, with probability $1-2e^{-\Omega(\min\{\delta^2\sigma n,\ \sigma^2n\})}$, $V_t=V_t(H_{PC}(n,p;k))$ satisfies

$$Q(\theta_\lambda\lambda,t)n-\delta n\le|V_t|\le Q(\theta_\lambda\lambda,t)n+\delta n,$$

and the degrees of vertices of the $t$-core are i.i.d $t$-truncated Poisson random variables with parameter $\Lambda_t:=\theta_\lambda\lambda+\beta$ for some $\beta$ with $|\beta|\le\delta$. Moreover, the distribution of the $t$-core is the same as that of the $t$-truncated Poisson cloning model with parameters $|V_t|$ and $\Lambda_t$.

Recall that $\theta_\lambda$ is the largest solution for the equation

$$\theta^{1/(k-1)}-Q(\theta\lambda,t-1)=0.$$
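To make the definition of $\theta_\lambda$ concrete, the following sketch locates the largest root of $f(\theta)=\theta^{1/(k-1)}-Q(\theta\lambda,t-1)$ in $(0,1]$ by scanning downward from $\theta=1$ and then bisecting. It is a rough numerical illustration under stated assumptions: $Q(\rho,t)=\Pr[\mathrm{Poi}(\rho)\ge t]$, $t\ge 2$, and a grid fine enough not to skip a sign change; the names `f`, `largest_root` and `grid` are hypothetical.

```python
import math


def Q(rho: float, t: int) -> float:
    """Pr[Poisson(rho) >= t]."""
    term, total = math.exp(-rho), 0.0
    for i in range(t):
        total += term
        term *= rho / (i + 1)
    return 1.0 - total


def f(theta: float, lam: float, k: int, t: int) -> float:
    """Left-hand side of the equation defining theta_lambda."""
    return theta ** (1.0 / (k - 1)) - Q(theta * lam, t - 1)


def largest_root(lam: float, k: int, t: int, grid: int = 10000):
    """Largest positive root of f on the grid, or None if f stays positive."""
    hi = 1.0  # f(1) = 1 - Q(lam, t - 1) > 0 for t >= 2
    for step in range(1, grid):
        lo = 1.0 - step / grid
        if f(lo, lam, k, t) <= 0.0:
            # Sign change in [lo, hi]: refine by bisection.
            for _ in range(100):
                mid = 0.5 * (lo + hi)
                if f(mid, lam, k, t) <= 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        hi = lo
    return None
```

For the 3-core of a random graph ($k=2$, $t=3$), a positive root appears only once $\lambda$ exceeds the critical value (roughly 3.35), which the scan reflects: no root is found for $\lambda=3$, while $\lambda=4$ gives a root between 0.8 and 0.9.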

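Proof. First, it is not hard to check that there are constants $c_1,c_2>0$ such that, for $\theta$ in the range $\theta_\lambda\le\theta\le 1$,

$$\theta^{1/(k-1)}-Q(\theta\lambda,t-1)\ge c_1\sigma^{1/2}(\theta-\theta_\lambda),$$

and, for $\theta$ in the range $\theta_\lambda-c_2\sigma^{1/2}\le\theta\le\theta_\lambda$,

$$\theta^{1/(k-1)}-Q(\theta\lambda,t-1)\le-c_1\sigma^{1/2}(\theta_\lambda-\theta).$$

Let $\tau$ be the largest $\theta$ with $A(\theta)=0$. Then $V(\tau)$ is the $t$-core of $H_{PC}(n,p;k)$. For $\theta_1=\theta_\lambda+\delta$ and $\theta_2=\theta_\lambda-\min\{\delta,c_2\sigma^{1/2}\}$ with $0<\delta\le 1$, Corollary 6.1 gives

$$\Pr[\tau\ge\theta_1]\le\Pr[B(\theta)=0\text{ for some }\theta\text{ with }\theta_1\le\theta\le 1]\le\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-(\theta^{1/(k-1)}-Q(\theta\lambda,t-1))\theta\lambda n\big|\ge c_1\sigma^{1/2}\delta n\Big]$$

and

$$\Pr[\tau\le\theta_2]\le\Pr[B(\theta_2)>0]\le\Pr\Big[B(\theta_2)-(\theta_2^{1/(k-1)}-Q(\theta_2\lambda,t-1))\theta_2\lambda n\ge c_1\sigma^{1/2}\min\{\delta,c_2\sigma^{1/2}\}n\Big]\le 2e^{-\Omega(\min\{\delta^2\sigma n,\ \sigma^2n\})}.$$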

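Since $\frac{d}{d\theta}Q(\theta\lambda,t)=\lambda P(\theta\lambda,t-1)\le\lambda$, we have

$$Q(\theta_1\lambda,t)\le Q(\theta_\lambda\lambda,t)+\lambda\delta,\quad\text{and}\quad Q(\theta_2\lambda,t)\ge Q(\theta_\lambda\lambda,t)-\lambda\delta,$$

and Corollary 6.1 implies that

$$\Pr[|V(\theta_1)|-Q(\theta_\lambda\lambda,t)n\ge 2\lambda\delta n]\le 2e^{-\Omega(\delta^2n)},$$

and

$$\Pr[|V(\theta_2)|-Q(\theta_\lambda\lambda,t)n\le-2\lambda\delta n]\le 2e^{-\Omega(\delta^2n)}.$$

Therefore,

$$\Pr[|\tau-\theta_\lambda|\ge\delta]\le\Pr[\tau\ge\theta_1]+\Pr[\tau\le\theta_2]\le 2e^{-\Omega(\min\{\delta^2\sigma n,\ \sigma^2n\})},$$

and, replacing $\delta$ by $\frac{\delta}{2\lambda}$,

$$\Pr\big[\big||V(\tau)|-Q(\theta_\lambda\lambda,t)n\big|\ge\delta n\big]\le\Pr[\tau\ge\theta_1]+\Pr[\tau\le\theta_2]+2e^{-\Omega(\delta^2n)}.$$

Clearly, once $V(\tau)$ and $\Lambda_t:=\tau\lambda$ are given, the residual degrees $d_v(\tau)$, $v\in V(\tau)$, are i.i.d $t$-truncated Poisson random variables with parameter $\Lambda_t$. □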

Once $V_t$ and $\Lambda_t$ are given, $|V_t(i)|$, $i\ge t$, is the sum of i.i.d Bernoulli random variables with mean $p_i(\Lambda_t):=\frac{P(\Lambda_t,i)}{Q(\Lambda_t,t)}$. Similarly, the size of $W_t(i)=\bigcup_{j\ge i}V_t(j)$ is the sum of i.i.d Bernoulli random variables with mean $q_i(\Lambda_t):=\frac{Q(\Lambda_t,i)}{Q(\Lambda_t,t)}$. Applying the generalized Chernoff bound (Lemma 4.2), we have



and 



Pr 
Pr 



\VS)\-plAt)\Vt\ >6\Vt\ 



Vt,At 
Vt,At 



< 2e 



\WM - qlAt)\Vt\ >6\Vt\ 
Combining these with Theorem 6.2 and using

$$|P(\rho,i)-P(\rho',i)|\le|\rho-\rho'|,\quad\text{and}\quad|Q(\rho,i)-Q(\rho',i)|\le|\rho-\rho'|$$

we obtain, for any $i$,



and 



Pr 



Pr 



\Vt{^\-P{9^\,t)n 



\Wt{7^\-Q{9,X,i)n 



> Sn 



> 6n 



In particular, as ^ = 9cTt + 6(o"^/^), for uniformly bounded a, it follows that, for A = 

Acrt + CT, 

\Vt{t)\ = {l + 0{a'/^)yP{9,^X,,u^)n + o(^{n/aY/Hogr?j, 

with probability 1 - 2e^^(™'^^'°s' 

The last part of Theorem 1.7 does not follow from Theorem 6.2 as $H(n,p;k)$ and $H_{PC}(n,p;k)$ do not have the same distribution. We may directly prove it instead.

For a hypergraph $H$, let $\tilde H$ be the hypergraph obtained from $H$ by removing all edges in the $t$-core. Then, $\tilde H$ has no edge that is entirely in $V_t(H)$; otherwise, the $t$-core becomes larger. Thus, the union of $\tilde H$ and any simple hypergraph on $V_t(H)$ is also simple. Therefore, conditioned on $\tilde H=\tilde H(n,p;k)$, two hypergraphs $H_1$ and $H_2$ on $V_t(H(n,p;k))$ with the same degree sequence are equally likely to be the $t$-core of $H(n,p;k)$: Notice that

$$\Pr[H(n,p;k)=\tilde H\cup H_1]=p^{m+m_1}(1-p)^{\binom{n}{k}-m-m_1},$$

and

$$\Pr[H(n,p;k)=\tilde H\cup H_2]=p^{m+m_2}(1-p)^{\binom{n}{k}-m-m_2},$$

where $m$, $m_1$ and $m_2$ are the numbers of edges in $\tilde H$, $H_1$ and $H_2$, respectively. Clearly, $m_1=m_2$ as the degree sequences of $H_1$ and $H_2$ are the same.

7 The pure literal algorithm for the random k-SAT



To analyze the pure literal algorithm for a random $k$-SAT, $k\ge 3$, it is necessary to consider a pair of degrees. We consider this in a generalized framework. Starting with some terminology, a sequence of generalized degrees is larger than or equal to another with the same length if so is each pair of corresponding generalized degrees. A property for sequences of generalized degrees is a set of generalized degree sequences. A property $P$ is increasing if sequences of generalized degrees larger than an element in $P$ are also in $P$.

Given a Poisson $\lambda$-cell on the set $V$ of $n$ vertices, let $C=\{C_i\}$ be an equipartition of the vertex set $V$. Then, $D_i(\theta):=(d_v(\theta),\bar d_v(\theta))_{v\in C_i}$ are i.i.d random variables. In particular, for any property $P$, the events $D_i(\theta)\in P$ are independent and occur with the same probability, say $p(\theta,\lambda;P,C)$, or simply $p(\theta)$. For the pure literal algorithm of the random $k$-SAT problem, the property is the set of pairs of generalized degrees $((d_1,d_2),(d_1',d_2'))$ with $d_1,d_1'\ge 1$.

For an increasing property $P$ and an equipartition $C=\{C_i\}_{i=1,\ldots,m}$, the $(P,C)$-process is
defined as follows. Construct the Poisson A-cell as described in Section [3l where A = p(^I^). 
The (P, C)-process is a generalization of the P-process. 

The $(P,C)$-process: Initially, the cut-off value $\Lambda=\lambda$. Activate all vertices $v\in C_i$ with $D_i(1)\notin P$. All clones of the activated vertices are activated too. Put activated clones in a stack in an arbitrary order. However, this does not mean that the clones are removed from the $\lambda$-cell.

(a) If the stack is empty, go to (b). If the stack is nonempty, choose the first clone in the stack and move the cut-off line to the left until the largest $k-1$ unmatched clones, excluding the chosen clone, are found. (So, the cut-off value $\Lambda$ keeps decreasing.) Then, match the $k-1$ clones to the chosen clone. Remove all matched clones from the stack and repeat. A vertex in $C_i$ that has not been activated is to be activated as soon as $D_i(\Lambda/\lambda)\notin P$. This can be done even before all $k-1$ clones are found. Its unmatched clones are to be activated too and put into the stack immediately. Clones found while moving the cut-off line are also in the stack until they are matched.

(b) Activate all vertices in the first $C_i$ none of whose vertices has been activated. Their clones are activated too. Put those clones into the stack. Then, go to (a).

Clones in the stack are called active. The steps carried out by the instruction described in (b) are called free steps, as it is free to artificially activate a vertex.

When the cut-off line is at $\theta\lambda$, all $\theta\lambda$-large clones are matched or will be matched at the end of the step and all vertices in $C_i$ with $D_i(\theta)\notin P$ are activated. All other vertices can be activated only by free steps. Let $V(\theta)=V_{(P,C)}(\theta)$ be the union of $C_i$ with $D_i(\theta)\in P$, and let $M(\theta)=M_{(P,C)}(\theta)$ be the number of $\theta\lambda$-large clones plus the number of $\theta\lambda$-small clones of $v\notin V(\theta)$. That is,

$$M(\theta)=\sum_{v\in V}\bar d_v(\theta)+d_v(\theta)1(v\notin V(\theta))=\sum_{i=1}^m\sum_{v\in C_i}\bar d_v(\theta)+d_v(\theta)1(D_i(\theta)\notin P).$$

Recalling that $N(\theta)$ is the number of matched clones until the cut-off line reaches $\theta\lambda$, the number $A(\theta)$ of active clones (when the cut-off $\Lambda$ is) at $\theta\lambda$ is at least as large as $M(\theta)-N(\theta)$. On the other hand, the difference $A(\theta)-(M(\theta)-N(\theta))$ is at most the number $F(\theta)$ of clones activated in free steps until $\theta\lambda$, i.e.,

$$M(\theta)-N(\theta)\le A(\theta)\le M(\theta)-N(\theta)+F(\theta).$$

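As the cut-off lemma gives a concentration inequality for $N(\theta)$,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|N(\theta)-(1-\theta^{k/(k-1)})\lambda n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

a concentration inequality for $M(\theta)$ will be enough to obtain a similar inequality for $B(\theta):=M(\theta)-N(\theta)$. More precisely, we will show that, under appropriate hypotheses,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|M(\theta)-(\lambda-q(\theta))n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

where

$$q(\theta)=q(\theta,\lambda;P,C)=E\Big[\frac{1}{|C_i|}\sum_{v\in C_i}d_v(\theta)\,1(D_i(\theta)\in P)\Big].$$

As the $D_i(\theta)$ are identically distributed, $q(\theta)$ does not depend on $i$. Recall also $p(\theta)=p(\theta,\lambda;P,C)=\Pr[D_i(\theta)\in P]$. Here is a generalization of the main lemma. Its proof is quite similar to that of the main lemma and it is presented in the Appendix.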

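Lemma 7.1 (Main lemma: generalized version) In the $(P,C)$-process described above, if $\theta_1<1$ is uniformly bounded from below by 0, and $|C_i|=O(1)$, $1-p(\theta_1)=O(1-\theta_1)$ and $p(\theta_1)=\Omega(1)$, then, for all $\Delta$ in the range $0<\Delta<n$,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big||V(\theta)|-p(\theta)n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

and

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-(\lambda\theta^{k/(k-1)}-q(\theta))n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$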

Let A > and F{\) = Fpc{n,p;k), where A = p{^J!_i)- As mentioned, we take the 
property $P=\{((d_1,d_2),(d_1',d_2')):d_1,d_1'\ge 1\}$ and $C_i=\{x_i,\bar x_i\}$. Then

$$p(\theta)=Q^2(\theta\lambda,1)=(1-e^{-\theta\lambda})^2,\quad\text{and}\quad q(\theta)=\theta\lambda(1-e^{-\theta\lambda}).$$

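Let $X(\theta)$ be the set of variables $x$ with both of $d_x(\theta)$ and $d_{\bar x}(\theta)$ larger than 0. Then the main lemma and $|V(\theta)|=2|X(\theta)|$ give

Corollary 7.2 For $\theta_1<1$ uniformly bounded from below by 0 and $\Delta$ in the range $0<\Delta<n$,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big||X(\theta)|-(1-e^{-\theta\lambda})^2n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

and

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-2(\theta^{1/(k-1)}-(1-e^{-\theta\lambda}))\theta\lambda n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$

As $1-e^{-\theta\lambda}=Q(\theta\lambda,t-1)$ with $t=2$, an argument similar to the one used in the previous section may be applied to prove Theorem 1.9.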

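Subcritical Region: For $\lambda=\lambda_{\mathrm{crt}}-\sigma$, $\sigma\gg n^{-1/2}$, let $\theta_1=\delta/\lambda_{\mathrm{crt}}$ with $\delta=0.1$, and let $\tau$ be the first time the number $A(\theta)$ of active clones at $\theta\lambda$ becomes 0. Since

$$2(\theta^{1/(k-1)}-(1-e^{-\theta\lambda}))\theta\lambda n\ge c\sigma n,\quad\text{for all }\theta\text{ in the range }\theta_1\le\theta\le 1,$$

for a constant $c>0$, the second part of Corollary 7.2 gives

$$\Pr[\tau\ge\theta_1]\le\Pr[B(\theta)=0\text{ for some }\theta\text{ with }\theta_1\le\theta\le 1]\le\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-2(\theta^{1/(k-1)}-(1-e^{-\theta\lambda}))\theta\lambda n\big|\ge c\sigma n\Big].$$

As $(1-e^{-\theta_1\lambda})^2\le\delta^2$, this and the first part of Corollary 7.2 yield

$$\Pr[|X_R(F_{PC}(n,p;k))|\ge\delta n]\le\Pr[\tau\ge\theta_1]+\Pr[|X(\theta_1)|\ge\delta n]\le 2e^{-\Omega(\sigma^2n)}.$$

Therefore, Theorem 1.8 implies that

$$\Pr[|X_R(F(n,p;k))|\ge\delta n]\le 2e^{-\Omega(\sigma^2n)}.$$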

To complete the proof, we observe that the residual formula on $i$ variables has at least $2i/k$ clauses. Let $Z_i$ be the number of subformulas on $i$ variables with at least $2i/k$ clauses, $i=i_0,\ldots,\delta n$, where $i_0=i_0(k)$ is the least $i$ such that $2^k\binom{i}{k}\ge 2i/k$. Then, in $F(n,p;k)$,

where $2i/k$ actually means $\lceil 2i/k\rceil$. Observe that

That is, $L_{i+k}/L_i$ decreases exponentially for $k\ge 3$. Since

$$L_i=O\big(n^in^{-(k-1)2i/k}\big)=O\big(n^{-i(1-2/k)}\big)$$

for $i=i_0,\ldots,i_0+k-1$, we have

$$\Pr[|X_R(F(n,p;k))|\ne 0]\le 2e^{-\Omega(\sigma^2n)}+O\big(n^{-i_0(1-2/k)}\big),$$

as desired. 



□ 



Supercritical Region: Applying the same argument used to prove Theorem 6.2 in the previous section, we may easily obtain

Theorem 7.3 If $\lambda$, defined as above, satisfies $\lambda\ge\lambda_{\mathrm{crt}}+\sigma$ with $\sigma\gg n^{-1/2}$ and $0<\delta\le 1$, then, with probability $1-2e^{-\Omega(\min\{\delta^2\sigma n,\ \sigma^2n\})}$, $|X_R(n,p;k)|$ satisfies

$$(1-e^{-\theta_\lambda\lambda})^2n-\delta n\le|X_R(n,p;k)|\le(1-e^{-\theta_\lambda\lambda})^2n+\delta n,$$

and the degrees of literals of $X_R(n,p;k)$ are i.i.d 1-truncated Poisson random variables with parameter $\Lambda_R:=\theta_\lambda\lambda+\beta$ for some $\beta$ with $|\beta|\le\delta$. Moreover, the distribution of the residual formula on $X_R(n,p;k)$ is the same as that of the 1-truncated Poisson cloning model with parameters $|X_R(n,p;k)|$ and $\Lambda_R$.

Proof. The proof is almost identical to that of Theorem 6.2 with $t=2$. □

Once $X_R:=X_R(n,p;k)$ and $\Lambda_R$ are given, $|X_R(i,j)|$, $i,j\ge 1$, is the sum of i.i.d Bernoulli random variables with mean $p_{i,j}(\Lambda_R):=\frac{P(\Lambda_R,i)P(\Lambda_R,j)}{Q(\Lambda_R,1)^2}$. Applying Lemma 5.2, the concentration inequality for a sum of i.i.d Bernoulli random variables, we now have

$$\Pr\Big[\big||X_R(i,j)|-p_{i,j}(\Lambda_R)|X_R|\big|\ge\delta|X_R|\Big]\le 2e^{-\Omega(\delta^2|X_R|)}.$$

Combining this with Theorem 7.3 and using

$$|P(\rho,i)-P(\rho',i)|\le|\rho-\rho'|$$

we obtain, for any $i$,



Similarly, 



Pr 



Pr 



\Xn{z,j)\-P{9,X,z)P{9,\,j)n 



\yRihj)\ 



> 6n 



> 6n 



The last statement of Theorem 1.9 follows by the same argument used in the previous section.



8 The Emergence Of the Giant Component 

In this section, we prove Theorem 1.4. Let the property $P=\{(d_1,d_2):d_2=0\}$. Then

$$p(\theta)=e^{-(1-\theta)\lambda}\quad\text{and}\quad q(\theta)=\theta\lambda e^{-(1-\theta)\lambda},$$

and the main lemma gives

Corollary 8.1 For $\theta_1<1$ uniformly bounded from below by 0 and $\Delta$ in the range $0<\Delta<n$,

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big||V(\theta)|-e^{-(1-\theta)\lambda}n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})},$$

and

$$\Pr\Big[\max_{\theta:\,\theta_1\le\theta\le 1}\big|B(\theta)-(\theta-e^{-(1-\theta)\lambda})\theta\lambda n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/((1-\theta_1)n)\})}.$$

To estimate $A(\theta)$, it is now enough to estimate $F(\theta)$ by (5.2). Once good estimations for $F(\theta)$ are established, we may take approaches similar to (but slightly more complicated than) those used in the previous sections. We consider an (imaginary) secondary stack with parameter $\rho$, or simply $\rho$-secondary stack. Initially, the secondary stack with parameter $\rho$ consists of the first $\rho n$ vertices $v_0,\ldots,v_{\rho n-1}$ of $V$. The set of those $\rho n$ vertices is denoted by $V_\rho$. Whenever the primary stack is empty, the first vertex in the secondary stack that has not been activated is to be activated. Its clones are activated too and put into the primary stack. The activated vertex as well as vertices activated by other means is no longer in the secondary stack. If the secondary stack is empty, go back to the regular procedure. This does not change the $P$-process at all, but will be used just for the analysis. Let $\tau_\rho$ be the largest $\tau$ such that, at $\tau\lambda$, the primary stack becomes empty after the secondary stack is empty. Thus, once the cut-off line reaches $\tau_\rho\lambda$, no active clones are provided from the secondary stack. Denoted by $C(\rho)$ is the union of the components containing any vertex in $V_\rho$.

The following lemma is useful to predict how large Tp is. 

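Lemma 8.2 Suppose $0<\delta,\rho<1$ and $\theta_1,\theta_2<1$ are uniformly bounded from below by 0. Then

$$\Pr[\tau_\rho\ge\theta_1]\le\Pr\Big[\min_{\theta:\,\theta_1\le\theta\le 1}B(\theta)\le-(1-\delta)\theta_1\lambda e^{-(1-\theta_1)\lambda}\rho n\Big]+2e^{-\Omega(\delta^2\rho n)},$$

and conversely,

$$\Pr[\tau_\rho\le\theta_2]\le\Pr\Big[B(\theta_2)\ge-(1+\delta)\theta_2\lambda e^{-(1-\theta_2)\lambda}\rho n\Big]+2e^{-\Omega(\delta^2\rho n)}.$$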

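Proof. For simplicity we will write $\tau$ and $W$ for $\tau_\rho$ and $V_\rho$, respectively. Since, at $\tau\lambda$, the primary stack is empty for the first time after no vertex is left in the secondary stack, $C(\rho)$ is exactly $W\cup V(\tau)$. And, all clones of vertices in $W\cup V(\tau)$ must have been matched. Thus,

$$M(\tau)+N(W)\ge N(\tau),\quad\text{or equivalently}\quad B(\tau)\ge-N(W),$$

where, in general, $N(V')$ is the number of clones of $v\in V'$. The inequality can be strict when there are $\tau\lambda$-large clones of $v\in W$. However, clones of a vertex $v\in W$ that has no $\tau\lambda$-large clone are not counted in $M(\tau)$. Thus,

$$M(\tau)+N(W(\tau))=N(\tau),\quad\text{or equivalently}\quad B(\tau)=-N(W(\tau)),$$

where $W(\theta)$ is the set of vertices in $W$ that have no $\theta\lambda$-large clone. Thus, $\tau$ is the unique $\theta$ such that $B(\theta)=-N(W(\theta))$ and $B(\theta')>-N(W(\theta'))$ for all $\theta'>\theta$.

If $\tau\ge\theta_1$, then $B(\theta)=-N(W(\theta))$ for some $\theta$ with $\theta_1\le\theta\le 1$. As $W(\theta_1)\subseteq W(\theta)$ for such $\theta$, we have

$$B(\theta)\le-N(W(\theta_1)),\quad\text{for some }\theta\text{ in the range }\theta_1\le\theta\le 1.$$

This implies that

$$\Pr[\tau\ge\theta_1]\le\Pr\big[N(W(\theta_1))\le(1-\delta)\theta_1\lambda e^{-(1-\theta_1)\lambda}\rho n\big]+\Pr\Big[\min_{\theta:\,\theta_1\le\theta\le 1}B(\theta)\le-(1-\delta)\theta_1\lambda e^{-(1-\theta_1)\lambda}\rho n\Big].$$

For $\theta$, in general,

$$N(W(\theta))=\sum_{v\in W}d_v(\theta)\,1(\bar d_v(\theta)=0)$$

is a sum of i.i.d. random variables with mean $\theta\lambda e^{-(1-\theta)\lambda}$, and it is easy to check by the generalized Chernoff bound that

$$\Pr\big[\big|N(W(\theta))-\theta\lambda e^{-(1-\theta)\lambda}\rho n\big|\ge\delta\theta\lambda e^{-(1-\theta)\lambda}\rho n\big]\le 2e^{-\Omega(\delta^2\rho n)}. \tag{8.1}$$

Conversely, if $\tau\le\theta_2$, then $B(\theta_2)\ge-N(W(\theta_2))$, which together with (8.1) yields

$$\Pr[\tau\le\theta_2]\le\Pr\big[N(W(\theta_2))\ge(1+\delta)\theta_2\lambda e^{-(1-\theta_2)\lambda}\rho n\big]+\Pr\big[B(\theta_2)\ge-(1+\delta)\theta_2\lambda e^{-(1-\theta_2)\lambda}\rho n\big]\le\Pr\big[B(\theta_2)\ge-(1+\delta)\theta_2\lambda e^{-(1-\theta_2)\lambda}\rho n\big]+2e^{-\Omega(\delta^2\rho n)}.$$

□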

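Since we are interested in $\theta$ close to 1, it is convenient to define $\bar A(\theta)=A(1-\theta)$, $\bar V(\theta)=V(1-\theta)$, $\bar B(\theta)=B(1-\theta)$, $\bar F(\theta)=F(1-\theta)$ and $\bar\tau=1-\tau$. Then Corollary 8.1 and Lemma 8.2 can be written as

Corollary 8.3 For $\theta_1>0$ uniformly bounded from above by 1 and $\Delta$ in the range $0<\Delta<n$,

$$\Pr\Big[\max_{\theta:\,0\le\theta\le\theta_1}\big||\bar V(\theta)|-e^{-\theta\lambda}n\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/(\theta_1n)\})},$$

and, for $b(\theta)=(1-\theta)(1-\theta-e^{-\theta\lambda})\lambda n$,

$$\Pr\Big[\max_{\theta:\,0\le\theta\le\theta_1}\big|\bar B(\theta)-b(\theta)\big|\ge\Delta\Big]\le 2e^{-\Omega(\min\{\Delta,\ \Delta^2/(\theta_1n)\})}.$$

Lemma 8.4 Suppose $0<\delta,\rho<1$ and $\theta_1,\theta_2>0$ are uniformly bounded from above by 1. Then

$$\Pr[\bar\tau_\rho\le\theta_1]\le\Pr\Big[\min_{\theta:\,0\le\theta\le\theta_1}\bar B(\theta)\le-(1-\delta)(1-\theta_1)\lambda e^{-\theta_1\lambda}\rho n\Big]+2e^{-\Omega(\delta^2\rho n)},$$

and conversely,

$$\Pr[\bar\tau_\rho\ge\theta_2]\le\Pr\Big[\bar B(\theta_2)\ge-(1+\delta)(1-\theta_2)\lambda e^{-\theta_2\lambda}\rho n\Big]+2e^{-\Omega(\delta^2\rho n)}.$$

In particular, if $0<\theta_1,\theta_2\ll\delta$, then

$$\Pr[\bar\tau_\rho\le\theta_1]\le\Pr\Big[\min_{\theta:\,0\le\theta\le\theta_1}\bar B(\theta)\le-(1-\delta)\lambda\rho n\Big]+2e^{-\Omega(\delta^2\rho n)},$$

and

$$\Pr[\bar\tau_\rho\ge\theta_2]\le\Pr\big[\bar B(\theta_2)\ge-(1+\delta)\lambda\rho n\big]+2e^{-\Omega(\delta^2\rho n)}.$$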



Proof of Theorem 1.5

Supercritical Region: Suppose $\lambda=1+\varepsilon$ with $\varepsilon\gg n^{-1/3}$ and $1\ll\alpha\ll(\varepsilon^3n)^{1/2}$. Three phases are to be considered based upon the values of $\theta$. Let $\theta_2=\theta_\lambda-\alpha(\theta_\lambda n)^{-1/2}$ and $\theta_3=\theta_\lambda+\alpha(\theta_\lambda n)^{-1/2}$. (Recall $\theta_\lambda$ is the larger solution of the equation $1-\theta-e^{-\theta\lambda}=0$.)

To bound the size of the largest component, it is enough to show that all of the following events occur with probability $1-e^{-\Omega(\alpha^2)}$.

(i) For $\rho=\theta_1\theta_\lambda=\alpha^2(\theta_\lambda n)^{-1}$, we have $\bar\tau_\rho\ge\theta_1$, especially $\bar F(\theta_2)\le N(V_\rho)$.

(ii) For $\theta$ in the range $\theta_1\le\theta\le\theta_2$, all $\bar B(\theta)$ are positive.

(iii) For some $\theta$ between $\theta_2$ and $\theta_3$, $\bar A(\theta)=0$.
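The quantity $\theta_\lambda$, the larger solution of $1-\theta-e^{-\theta\lambda}=0$, has no closed form but is easy to compute. The sketch below is a plain bisection under the assumption $\lambda>1$; the function name and the bracketing choice are illustrative. The lower bracket $(\lambda-1)/\lambda^2$ works because $g(\theta)=1-\theta-e^{-\theta\lambda}$ is concave with $g(0)=0$, $g'(0)=\lambda-1$ and $g''\ge-\lambda^2$, so $g$ is positive there, while $g(1)=-e^{-\lambda}<0$.

```python
import math


def theta_lambda(lam: float) -> float:
    """Larger solution of 1 - theta - exp(-theta * lam) = 0, assuming lam > 1."""
    if lam <= 1.0:
        raise ValueError("the positive solution exists only for lam > 1")

    def g(theta: float) -> float:
        return 1.0 - theta - math.exp(-theta * lam)

    lo = (lam - 1.0) / lam ** 2  # g(lo) > 0 by concavity
    hi = 1.0                     # g(1) = -exp(-lam) < 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $\lambda=1+\varepsilon$ with small $\varepsilon$ the computed value is close to $2\varepsilon$, in line with the relation $\theta_\lambda=(2+O(\varepsilon))\varepsilon$ recalled later in the proof.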

Once (i) and (ii) occur, as ^4(6') > B{9), the vertices activated between (1 — 9JX and 
(1 — fi,)-^ the same component, say Ci. Excluding the vertices in Vp, all vertices that 

have a (1 — 6'2)A-large clone but no (1 — 6'JA-large clone belong to Ci. That is, V{9-^) \ V{9^) C 
Ci U Vp. Corollary ESI for A = a{9^nY/y2 gives 

Pr[\V{9J \ V{9^)\ < e-^^\l - e'^^-^'^^^^)n - a{9^ny/^] < e'^^"'). 

As 9^ < < 6, < 1, < and 9^=9^- a{9^ny^l'^, we have 

\C,\ > \V{9J \ m)\ - \Vp\ > (1 - e-^> - 0{a{n/9,y/') = 9,n - 0{a{n/9J'/% 

with probability 1 — e~^^°'^\ 

Since (i) and (ii) imply fp > 9^, if (iii) occurs in addition, then Ci C (V \ V{9,^)) U Vp. 
Corollary [8I3] then yields \V \ Vi9^)\ < (1 - e-^3^)n + a{9^nY/^ with probability 1 - e~^(°'). 
As 

pn < a{9^ny/^, and 1 - e'^a^ = 1 _ e~^^ + 0{a{9^n)'^/^) = 9^+ 0{a{9^n)'^/^), 

Ci is of size at most 9^n + 0{{n/9^Y/'^) with the desired probability. Replacing a by 5a for 
an appropriate constant 5 > 0, the bounds for \Ci\ follows as desired in Theorem II. 4[ 

The union S of components constructed before Ci is a subset of (V^\V^(6'J)UV^. Corollary 
18.31 with A = Q.19^n and 1 — < x give 

\V\V{9;)\< 9^\n + 0.161,72 < l.l9^\n, 

with probability 1 - 2e~^(^i") > 1 - 2e~^(°'). Thus, 

|5| <l.ie> + pn< ^ = e(^), 

A 

with probability 1 — Q-^i'^^) , 

Clearly, the vertex set Vr of the residual graph without Ci and S satisfies 

V{9,)\Vp<ZVn<ZV{9,), 

and hence 

(1 -9,)n- a{nlef'^ <\Vr\<{1- 9,)n + a{n/eY/\ 

by replacing $\alpha$ by $0.1\alpha$ if necessary. Furthermore, the cut-off value $\Lambda^*$ when the construction of $C_1$ is concluded is between $(1-\theta_3)\lambda$ and $(1-\theta_2)\lambda$, as desired. (Recall, $\theta_\lambda=(2+O(\varepsilon))\varepsilon$.)

For the proofs of (i), (ii) and (iii), we observe that there is a positive constant $c<1$ such that

$$b(\theta)=(1-\theta)(1-\theta-e^{-\theta\lambda})\lambda n\ge c\theta(\theta_\lambda-\theta)\lambda n, \tag{8.2}$$

for $0\le\theta\le\theta_\lambda$ and

$$b(\theta)=(1-\theta)(1-\theta-e^{-\theta\lambda})\lambda n\le-c\theta_\lambda(\theta-\theta_\lambda)\lambda n, \tag{8.3}$$

for $\theta\ge\theta_\lambda$ with $\theta$ uniformly bounded from above by 1.

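Proof of (i) Since $\theta_1\ll\theta_\lambda<1$, $\rho=\theta_1\theta_\lambda$ and $b(\theta)\ge 0$ for all $\theta\le\theta_\lambda$, Lemma 8.4 for $\delta=0.5$ gives

$$\Pr[\bar\tau_\rho\le\theta_1]\le\Pr\Big[\min_{\theta:\,0\le\theta\le\theta_1}\bar B(\theta)\le-0.5\rho\lambda n\Big]+2e^{-\Omega(\rho n)}\le\Pr\Big[\min_{\theta:\,0\le\theta\le\theta_1}\big(\bar B(\theta)-b(\theta)\big)\le-0.5\rho\lambda n\Big]+2e^{-\Omega(\alpha^2)}.$$

And Corollary 8.3 yields

$$\Pr\Big[\min_{\theta:\,0\le\theta\le\theta_1}\big(\bar B(\theta)-b(\theta)\big)\le-0.5\rho\lambda n\Big]\le 2e^{-\Omega(\rho^2n/\theta_1)}\le 2e^{-\Omega(\alpha^2)}.$$

□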

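Proof of (ii) If $\theta$ is in the range $\theta^*:=\alpha(\theta_\lambda n)^{-1/2}\le\theta\le\theta_2$, then

$$b(\theta)\ge c\theta(\theta_\lambda-\theta)\lambda n\ge 0.9c\alpha(\theta_\lambda n)^{-1/2}\theta_\lambda\lambda n=0.9c\alpha\lambda(\theta_\lambda n)^{1/2}$$

(see (8.2)). Thus, Corollary 8.3 yields

$$\Pr\Big[\min_{\theta:\,\theta^*\le\theta\le\theta_2}\bar B(\theta)\le 0\Big]\le\Pr\Big[\min_{\theta:\,\theta^*\le\theta\le\theta_2}\big(\bar B(\theta)-b(\theta)\big)\le-0.9c\alpha\lambda(\theta_\lambda n)^{1/2}\Big]\le 2e^{-\Omega(\alpha^2)}.$$

For $\theta$ between $\theta_1$ and $\theta^*$, let $\theta_i=i\theta_1$, $i=1,2,\ldots$, up to the least integer $i$ with $i\theta_1\ge\theta^*$. Then, since $b(\theta)\ge b(\theta_i)\ge(1+o(1))c\theta_i\theta_\lambda\lambda n$ for $\theta_i\le\theta\le\theta_{i+1}$, applying Corollary 8.3, we have

$$\Pr\Big[\min_{\theta:\,\theta_i\le\theta\le\theta_{i+1}}\bar B(\theta)\le 0\Big]\le\Pr\Big[\min_{\theta:\,\theta_i\le\theta\le\theta_{i+1}}\big(\bar B(\theta)-b(\theta)\big)\le-0.9c\theta_i\theta_\lambda\lambda n\Big]\le 2e^{-\Omega(\theta_i\theta_\lambda^2n)}=2e^{-\Omega(i\alpha^2)}.$$

Thus,

$$\Pr\Big[\min_{\theta:\,\theta_1\le\theta\le\theta^*}\bar B(\theta)\le 0\Big]\le\sum_{i\ge 1}2e^{-\Omega(i\alpha^2)}=2e^{-\Omega(\alpha^2)}.$$

□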



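Proof of (iii) If (iii) does not occur, then (ii) implies that $\bar F(\theta_3)=\bar F(\theta_2)$. Also,

$$0<\bar A(\theta_3)\le\bar B(\theta_3)+\bar F(\theta_3)=\bar B(\theta_3)+\bar F(\theta_2)$$

gives $\bar B(\theta_3)\ge-\bar F(\theta_2)$. By (i), this yields $\bar B(\theta_3)\ge-N(V_\rho)$. As $N(V_\rho)$ is a Poisson $\lambda\rho n$ random variable,

$$\Pr[N(V_\rho)\ge 1.1\lambda\rho n]\le 2e^{-\Omega(\rho n)}\le 2e^{-\Omega(\alpha^2)},$$

and hence

$$\Pr[\bar B(\theta_3)\ge-N(V_\rho)]\le e^{-\Omega(\alpha^2)}+\Pr[\bar B(\theta_3)\ge-1.1\lambda\rho n].$$

As $b(\theta_3)\le-c\alpha(\theta_\lambda n)^{-1/2}\theta_\lambda\lambda n=-c\alpha\lambda(\theta_\lambda n)^{1/2}$ (see (8.3)), $\rho n=\alpha^2/\theta_\lambda\ll\alpha(\theta_\lambda n)^{1/2}$, and

$$-b(\theta_3)-1.1\lambda\rho n\ge 0.9c\alpha\lambda(\theta_\lambda n)^{1/2},$$

Corollary 8.3 gives

$$\Pr[\bar B(\theta_3)\ge-1.1\lambda\rho n]\le\Pr\big[\bar B(\theta_3)-b(\theta_3)\ge 0.9c\alpha\lambda(\theta_\lambda n)^{1/2}\big]\le e^{-\Omega(\alpha^2)}.$$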



□ 



Subcritical Region: (Upper Bound) Suppose \ = 1 — e with e ^ n~^^^. We take p = 
I log(e^n). Until the secondary stack with p = j log(e^n) becomes empty, each free step 
can be regarded as the start of a branching process in which the number of children is the number of newly activated clones. The branching process ends just before the next free step. We call the branching process initiated by the $i$-th free step the $i$-th branching process. The whole process ends when no vertex is left in the secondary stack at the conclusion of a branching process. As the numbers of newly activated clones are stochastically bounded by independent Poisson $(1-\varepsilon)$ random variables, we know that the $i$-th branching process dies out, say with $D_i$ descendants, for all $i$. Let $D(1-\varepsilon)$ be the number of descendants for the branching process with (independent) Poisson $(1-\varepsilon)$ children. Then

Pr[A >k\< Pr[D(l ~e)>k\ = e' ^ ^l_^g(e+iog(i-£))(£-i)^ 
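The tail of $D(1-\varepsilon)$ can be evaluated exactly. It is a standard fact, not stated in the text, that the total progeny of a branching process with Poisson($\mu$) offspring, $0<\mu<1$, follows the Borel distribution $\Pr[D=j]=e^{-\mu j}(\mu j)^{j-1}/j!$. The sketch below, with illustrative function names and progeny counted including the initial organism, computes this distribution in log space. By Stirling's formula the terms decay like $e^{-(\mu-1-\log\mu)j}$, which for $\mu=1-\varepsilon$ is $e^{(\varepsilon+\log(1-\varepsilon))j}$, the exponential rate appearing in the bound above.

```python
import math


def progeny_pmf(mu: float, j: int) -> float:
    """Pr[total progeny = j] for Poisson(mu) offspring, 0 < mu < 1 (Borel distribution)."""
    return math.exp(-mu * j + (j - 1) * math.log(mu * j) - math.lgamma(j + 1))


def progeny_tail(mu: float, k: int) -> float:
    """Pr[total progeny >= k], using the fact that extinction is certain for mu < 1."""
    return 1.0 - sum(progeny_pmf(mu, j) for j in range(1, k))
```

As a sanity check, the probabilities sum to 1 and the mean total progeny equals $1/(1-\mu)$.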

As at most Di clones are involved in the i"^ branching process, the size of the component 
containing the vertex activated by the i*^ free step is at most Di. Observing there are at 
most pn possible i, we have the following lemma. 

Lemma 8.5 Suppose \ = \ — e with e ^ n^^l"^ . Then, for the secondary stack Sp with 
p = I \o<g{c'n), 



Pr[3t;G5„ \C,\>k\<-^— „ 

log(£'^n) V. 



In particular, 

We will now show that the cut-off line decreases fast enough. 






Lemma 8.6 Suppose e < 0.01 and p = ae^ with a ^ 1. Then 

Pr[(l - fp)A > 1 - (1 + f)£] < 2e-^(""'"\ 
Proof. For S = 0.01 and 6^ = 0.7ae, Lemma [8.41 gives 



Pr[f, < 9,] < PrL min 3(6) < -{1 - 6){1 - e,)Xe-'o'pn] + 2e-^(^^''"). 

Using 

(1 - 5){1 - ^JAe-^o^pn > O.Oa^^An, 

and 

Kd) > K^o) > -(1 - ^o)Mo + ^o/2)An > -O.Sae^Xn 
for < ^ < we have, by Corolallry 18.3^ that 

Pr[ min < -(1 - 5)(1 - ^JAe^^o V] < Pr[ min B{9) < -0.9 ae^Xn] 

< Pr[ min B{9) - b{9) < -O.lae^An] 

9'.0^9^9q 

Since fp > 9^ imphes that 

(l-f,)A<(l-0J(l-£)<l-(l + fK 

the desired bound follows. □ 

Applying Lemmas 18.51 and 18.61 iteratively for a <^ 1 until the cut-off value is less than 
0.99, we have 

logie'^n) \ il / 

where e^^ = e and e. > (1 + f)£i_i, especially > (1 + f)*£: > (1 + f )£• For any 5 > 0, 
taking 

1 log(£:^n) — 2.5 loglog(£:^n) + c 



we have 
Using 

we finally have 



/ 'x \ and /c / 1 / \ \ 

log(£:'^?T,) — (e + log(l — £)) 

ni-\p-i /p(£,+log{l-£,))A: 



+ log(l - ej < (1 + f )(£ + log(l - £)), 



,„ „(£+log(l— e))fc ^ ^3^ 

Pr[3i;, lai > A;] = of ,,,, , , J + 2e~''^^^^^ + o(ne-^(i°s("'"+^'°siog(e^n))/e2)) 



/;;3/2 log(£:'^?7,)
and, for £ ^ 1,

3 

Pr[3'i;, \C^\ >k]< 26'^^^^ + 2e~^^^^^)^ + 2e-^(i°s(s'")/^'). 

For the lower bound, let $\alpha=1/\log(\varepsilon^3n)$ and $\rho=\alpha\varepsilon^2$. We will approximate the size of the component $C_i$ of the vertex activated by the $i$-th free step, $i=1,\ldots,0.9\rho n$. If the secondary stack becomes empty earlier, $C_i$ is set to be empty. To be more precise, the following auxiliary branching process is needed. Initially, there is one organism. Generally, the number of children is given by the random variable $(1-Y)X-Y$ where $X$ is a Poisson $1-(1+2\alpha)\varepsilon$ random variable and $Y$ is an independent Bernoulli random variable with $\Pr[Y=1]=2\alpha\varepsilon^2$. In other words, the population is given by $Z_0=1$, and $Z_j=Z_{j-1}+(1-Y_j)X_j-1-Y_j$, where $(X_j,Y_j)$ are i.i.d random variables with the same distribution as $(X,Y)$. The branching process ends when $Z_j\le 0$. We will couple the branching process with $C_i$ so that, under certain conditions that hold with sufficiently high probability, $|C_i|$ is at least the sum of the $(1-Y_j)$'s over $j$ with $Z_\ell>0$ for all $\ell<j$.

To estimate the $C_i$'s, let $\Lambda_i$ be the cut-off value at the beginning of the $i$-th free step. Then the $i$-th free step will generate Poisson $\Lambda_i$ active clones. In a subsequent step of the $i$-th branching process, an active clone $x$ and the largest unmatched clone excluding $x$, say $y$, are to be matched. If the cut-off value $\Lambda$ were $(1-\theta)\lambda$, $\bar A(\theta)\ne 0$, and $\bar V(\theta)$ were given at the beginning of the step, then we may lower bound the probability that $y$ has not been activated. (If $\bar A(\theta)=0$, the branching process ends.) Clearly, the set of vertices that have not been activated at the beginning of the step contains $V^*:=\bar V(\theta)\setminus V_\rho$. Thus, the probability is at least as large as the probability that a clone of a vertex in $V^*$ is larger than all of the $\bar A(\theta)-1$ currently active clones excluding $x$. Notice that the largest number assigned to clones of vertices of $V^*$ is less than $(1-t)\Lambda$ with probability $e^{-t\Lambda|V^*|}$ and the corresponding density function is $\Lambda|V^*|e^{-t\Lambda|V^*|}$. Using $|V^*|\ge|\bar V(\theta)|-\rho n$, we have

Pr[y has not been activated] > / (1 - t)l^(^)l-^A|l/*|e-*^l^*l (it 

Jo 

POO 

> l-A|i(^)||V*| / 
Jo 

^ 1_W)1>1 1^(^)1 



A|l^*| - A{\V{e)\-pn) 

Conditioned that y has not been active, the number of new active clones is a Poisson A' 
random variable, where A' is the cut-off value at the end of the step. 
Provided the i^^ free step started and 



A{\V{e)\-pn) 



< 2ae\ and A' > 1 - (1 + 2a)e, 



the number of active clones after the end of the step is at least Aiff) -|- (1 — Y^Xj — \ — Yy 
Moreover, in each step 1^ = 0, a distinct vertex is added to eventually form the component 
Ci- Thus, 

$$|C_i|\ge\sum_{j>0}(1-Y_j)\,1(Z_\ell>0,\ \forall\,\ell<j), \tag{8.4}$$

where $Z_0=1$ and, for $j\ge 1$, $Z_j=Z_{j-1}+(1-Y_j)X_j-1-Y_j$, or $Z_j=1+\sum_{i=1}^{j}\big((1-Y_i)X_i-1-Y_i\big)$. Even if any of the conditions is not satisfied, we still define the auxiliary branching process exactly the same way. However, the coupling and the inequality it implies are no longer true.

Let $\theta_1=1.4\alpha\varepsilon$. We will show that the following events occur with large enough probability.

(i) $\bar\tau_\rho\le\theta_1$, (ii) $|V_\rho(\theta_1)|\ge 0.9\rho n$, (iii) $|\bar V(\theta_1)|\ge 0.9n$, (iv) $\max\bar B(\theta)\le 0.1\rho n$,

and (v) the number of clones of vertices in $V_\rho$ is less than $1.1\rho n$. Notice that (i) $\cap$ (ii) implies that there are at least $0.9\rho n$ free steps, and (i) gives that the cut-off value never becomes less than $1-(1+2\alpha)\varepsilon$ during the entire $0.9\rho n$ branching processes. As $\bar A(\theta)$ with $\theta\le\bar\tau_\rho$ is at most $\bar B(\theta)$ plus the number of clones of vertices in $V_\rho$, (i) $\cap$ (iii) $\cap$ (iv) $\cap$ (v) implies that

\A(e)\ 1.2pn „ 



A{\V{e)\-pn) ~ il-ejX{0.9n- pn) 
Therefore, (8.4) gives

Pr[|Ci| <k, = 1, 0.9pn] < Prh(i)] + Fi[^{ii)] + Prh(m)] + Pi[^{iv)] + Prh(t;)] 

+ Pr [5^(1 - Y,)l{Ze >0,W< j) < kj'''"^ . 

First, Lemma 8.4 gives

Pr[^(z)] = Pr[fp > e^] < Pr[5(6'J > -1.1(1 - + 2e-^'^P^\ 

Since -1.1(1 - e^)Xe-^i^pn > -1.2pXn = -1.2ae^Xn, and 

-b{e^) = -(1 - ej{l - 0, - e"^i^)An > (1 - ejeO.Xn > 1.3ae^Xn, 
Corollary 8.3 gives

Pr[i?(6'J > -1.1(1 - < Pr[5(6'J - 6(6'J > O.lae^Xn] < 2e-^("^'"). 

Appealing Corollary 18.31 and using e^^^^ = 1 + o(l), we have 

Prh(iz)] < 2e-^(""'"), and Pi[^{iii)] < 2e-^("). 
For (iv), Lemma [8.41 and b{6) < for > yield 

2 

PiHiv)] < Pr[ max \B(9) - bm > O.lpn] < 2e~^^^^ = 2e-^("^'"). 

Clearly, Pt[-^{v)] < 2e-^^f"^ = 2e-^("^'") as the number is a Poisson pXn random variable. 
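Therefore, for

\[
k = \frac{\log(\varepsilon^3 n) - 2.5\log\log(\varepsilon^3 n) - c}{-(\varepsilon + \log(1 - \varepsilon))},
\]

\[
\Pr[|C_i| < k \ \forall i = 1, \ldots, 0.9\rho n] \le 2e^{-\Omega(\varepsilon^2 n)} + \Pr\Big[\sum_{j>0} (1 - Y_j)\,1(Z_\ell > 0,\ \forall \ell < j) < k\Big]^{0.9\rho n}.
\]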



Hence, it is enough to show that 

Pr [5^(1-F,)1(Z,>0, W<j)<k 



0.9pn 



(l - Pr [5^(1 - > 0, V£ < j) > A;] ) 



i>o 



0.9pn 



< e 



Recalling $Z_\ell = 1 + \sum_{j=1}^{\ell} ((1 - Y_j)X_j - 1 - Y_j)$, observe that, conditioned on $Y_1 = \cdots = Y_{k+1} = 0$,
$Z_\ell = 1 + \sum_{j=1}^{\ell} X_j - \ell$ for $\ell = 1, \ldots, k + 1$, which is exactly the population for the
Poisson branching process with mean number of children $1 - (1 + 2a)\varepsilon$. As

\[
\Pr[Y_1 = \cdots = Y_{k+1} = 0] = (1 - 2a\varepsilon^2)^{k+1} = (1 + o(1))e^{-2ak\varepsilon^2},
\]

it follows that



(l + o(l))e 



-2 



Pr [^(1 - Yj)i{Zi > 0, \/e < j) > k = 



j>0 



,({l+2a)e+log(l-(l+2a)e))fc 
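An exponent of the form $(x + \log(1 - x))k$ is what one expects from the total progeny of a subcritical Poisson branching process. As a hedged illustration (the function names are hypothetical, and this is the standard Borel formula rather than a restatement of the estimate above): for mean number of children $\mu = 1 - x$, the probability that the total progeny equals $k$ is $e^{-\mu k}(\mu k)^{k-1}/k!$, and Stirling's formula turns this into $e^{(x + \log(1 - x))k} / ((1 - x)\sqrt{2\pi}\,k^{3/2})$ up to a factor $1 + O(1/k)$.

```python
import math


def borel_log_pmf(k, mu):
    """log Pr[total progeny = k] for a Poisson(mu) branching process, mu <= 1."""
    return -mu * k + (k - 1) * math.log(mu * k) - math.lgamma(k + 1)


def stirling_log_approx(k, x):
    """log of exp((x + log(1 - x)) * k) / ((1 - x) * sqrt(2 * pi) * k**1.5)."""
    return ((x + math.log1p(-x)) * k
            - math.log1p(-x)
            - 0.5 * math.log(2 * math.pi)
            - 1.5 * math.log(k))


if __name__ == "__main__":
    x = 0.2  # illustrative value; the mean number of children is 1 - x
    for k in (10, 100, 500):
        print(k, borel_log_pmf(k, 1 - x), stirling_log_approx(k, x))
```

Working with logarithms through `math.lgamma` avoids overflow in $k!$ for large $k$.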



Using k = e(£:~2log(e^n)) and (1 + 2a)e + log(l - (1 + 2a)e) > (1 + 3a) (e + log(l - e)) for 
e < 0.01, we further have 



Pr ^^{l-Yj)l{Ze > 0, V£ < j) > k 
and hence 



g-(l+3a)(log(e3n)-2.51og \og{e^n)-c) 



1 - Pr [5^(1 - Yj)liZe > 0, V£ < j) > k 

j>0 



n] 



0.9pn 



< e 



Inside window: Suppose $\lambda = 1 + \varepsilon$ with $|\varepsilon| = O(n^{-1/3})$. For a large enough constant $K$, we
may sandwich $G_{PC}(n,p)$ between $G_{PC}(n,p_1)$ and $G_{PC}(n,p_2)$, where $p_1(n - 1) = 1 - Kn^{-1/3}$
and $p_2(n - 1) = 1 + Kn^{-1/3}$. Thus, $G_{PC}(n,p)$ has a component of size $\Omega(K^{-2}n^{2/3}\log K)$
and no component of size $O(Kn^{2/3})$. □



9 Closing Remarks 

The Poisson $\lambda$-cell is introduced to analyze properties of $G_{PC}(n,p)$, in which the degrees are
i.i.d. Poisson random variables with mean $\lambda = p(n - 1)$. Then various nice properties of
Poisson random variables are used to analyze the sizes of the largest component and the $t$-core
of $G_{PC}(n,p)$. We believe that the approaches presented in this paper are useful for
analyzing problems of a similar flavor, especially problems related to branching processes.
For example, we can easily modify the proof of Theorem 1.7 to analyze the pure literal
algorithm for the random $k$-SAT problems, $k \ge 3$. Another example may be the Karp-Sipser
algorithm to find a large matching of the random graph. (See [l9l[7].) In a subsequent paper,
we will analyze the structures of the 2-core of $G(n,p)$ and the largest strong component of the
random directed graph as well as the pure literal algorithm for the random 2-SAT problem.

For the random (hyper)graph with a given degree sequence $(d_i)$, we may also introduce the
$(d_i)$-cell, in which the vertex $v_i$ has $d_i$ clones and each clone is assigned a uniform random
real number between 0 and the average degree $\frac{1}{n}\sum_i d_i$. Though it is no longer possible to use
all of the nice properties of Poisson random variables, we believe that the $(d_i)$-cell
equipped with the cut-off line algorithm can be used to prove stronger results for the $t$-core
problems considered in various papers including [221 [Ml HSl [58]. The case of the random
$k$-SAT problem conditioned on a given degree sequence can be similarly analyzed.
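The cell for a given degree sequence is simple to generate. The sketch below is a minimal illustration with hypothetical names: it creates the clones and their uniform labels only, and does not implement the cut-off line algorithm itself.

```python
import random


def degree_sequence_cell(degrees, rng):
    """Return a list of (vertex, label) pairs: vertex i contributes degrees[i]
    clones, each labelled by an independent uniform real number between 0 and
    the average degree."""
    average = sum(degrees) / len(degrees)
    return [(v, rng.uniform(0.0, average))
            for v, d in enumerate(degrees)
            for _ in range(d)]


if __name__ == "__main__":
    cell = degree_sequence_cell([2, 0, 3, 1], random.Random(0))
    # Sorting by label, largest first, lists clones in decreasing label order.
    for vertex, label in sorted(cell, key=lambda pair: -pair[1]):
        print(vertex, round(label, 3))
```

Passing the random generator explicitly keeps runs reproducible, which is convenient when comparing experiments on the same degree sequence.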

A better algorithm, called the unit clause algorithm, is known for random $k$-SAT problems.
(See e.g. [H [21 [3l [191 [3H].) We believe that the unit clause algorithm and some of its variations
can be analyzed using the Poisson cloning model equipped with the cut-off line algorithm.

Recall that each degree in $G(n,p)$ has the binomial distribution with parameters $n - 1$ and
$p$. By introducing the Poisson cloning model, we in effect first take the limit of the binomial
distribution, which is the Poisson distribution. In general, many limiting distributions like
Poisson and Gaussian ones have nice properties. In our opinion, this is because various
small differences are eliminated by taking the limits, and limiting distributions have some
symmetric and/or invariant properties. Thus, it may be natural to wonder if there is an
infinite graph that shares most properties of the random graphs $G(n,p)$ with large enough
$n$. So, in a sense, the infinite graph, if it exists, can be regarded as the limit of $G(n,p)$. An
infinite graph which Aldous [5] considered to solve the linear assignment problem may or
may not be a (primitive) version of such an infinite graph. Though it may be impossible to
construct such a graph, the approaches taken in this paper might be useful to find one, if
any.
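The closeness of the binomial degree distribution to its Poisson limit can be checked numerically. The sketch below uses hypothetical names and a truncated sum (the cut at `kmax` is an assumption that ignores a negligible tail for small means). It computes the total variation distance between the binomial distribution with parameters $n - 1$ and $p = \lambda/(n - 1)$ and the Poisson distribution with mean $\lambda$.

```python
import math


def binomial_log_pmf(k, trials, p):
    """log of C(trials, k) * p**k * (1 - p)**(trials - k)."""
    return (math.lgamma(trials + 1) - math.lgamma(k + 1)
            - math.lgamma(trials - k + 1)
            + k * math.log(p) + (trials - k) * math.log1p(-p))


def poisson_log_pmf(k, lam):
    """log of exp(-lam) * lam**k / k!."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def degree_tv_distance(n, lam, kmax=60):
    """Truncated total variation distance between Binomial(n - 1, lam / (n - 1))
    and Poisson(lam), summing over k = 0, ..., min(kmax, n - 1)."""
    trials = n - 1
    p = lam / trials
    top = min(kmax, trials)
    total = 0.0
    for k in range(top + 1):
        total += abs(math.exp(binomial_log_pmf(k, trials, p))
                     - math.exp(poisson_log_pmf(k, lam)))
    return 0.5 * total


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, degree_tv_distance(n, 1.0))
```

By Le Cam's inequality the distance is at most $(n - 1)p^2 = \lambda^2/(n - 1)$, so it shrinks as $n$ grows with $\lambda$ fixed.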

Acknowledgement. The author thanks C. Borgs, J. Chayes, B. Bollobás and Y. Peres for
helpful discussions.

Appendix: The proof of Lemma 7.1

For the proof of Lemma 7.1, observing that

m m 

\vm = ^ \c,\i{Dm ep) = -j2 m{e) e p). 



m 

1=1 i=l 



Lemma 5.2 yields the following corollary.

Corollary 9.1 For 9 in the range 9^< 9 < 1 and with the same hypotheses as in the main 
lemma, 

Vi\\Vp{9)\ - p{9)n\ > A] < 2e"^^"'"^^'^A)^". 
As in Example 4.4, Lemma 4.5 yields a concentration inequality for all of the $V(\theta)$'s:

Lemma 9.2 With the same hypotheses as in the main lemma,



Pr 



max 



\V{9)\-p{9)n 



< A 



< 2e 



_f7(min{A,^^A_|) 



Proof. Observing that, for < < 1, 

'^\\v{e)\-p{e)n)\ < \\v{ej\-p{e,)n\ + \\v{e,)\ - '^\v{e)\ 

we set r(^) = \ \v{e)\ -p{e)n\, 

\ym~'^\v{e)\ 

Clearly, $\Gamma(\theta) \le \psi + \psi_\theta$. Corollary 9.1 gives

Pr[V;>A/2]<2e-"^"^''^^^'l4)^^\ 



^ = ^J(^r(^J, and V^e = ^ 



(9.1) 



Suppose {X0I := V{9')}g<0i<i is given, especially V{9) is given. Then, since P is increasing, 
we may write |V^(6'J| as 

im)i= E mimo,)eP), 



with 



i:CiCV{e) 



Pr[A(^J G P\{Xe>}e'<e] = Pr[A(^J G P\C, C 1^(0)] 



p{0) 



for $i$ with $C_i \subseteq V(\theta)$. Lemma 5.2 then gives



(9.2) 

Lemma 4.5 together with (9.1) and (9.2) yields the desired inequality.

□ 

We now estimate M{0). First, since Yliv&/ ^vi9) is a Poisson random variable with mean 

(1 - e)\n, 



Pr 



For the second sum, observe that 

m 



(9.3) 



v&V 1=1 v&d 

is a sum of m i.i.d random variables with 



dM)i{Di{e) ^p)]=e\Y, dM - ( dMY{D,{G) G = {e\-q{e))\c, 
Moreover, since $P$ is an increasing property and $\sum_{v \in C_i} d_v(\theta)$ is a Poisson $\theta\lambda|C_i|$ random
variable, the FKG inequality (see e.g. Chapter 6 of [6]) gives



for all fixed e.g. i = 1, 2, 3. Thus one may take = 1 and Oj, hi = 6(1 — p{0)) to satisfy 
all the conditions to apply the generalized Chernoff bound and to obtain 

Pr [ |^rf„(^)l(t; ^ V(^)) - (0A-g(0))n >A < 2e"^^°''°^'^'TT=^^\ 

provided $|C_i| = O(1)$. This together with (9.3) implies that if $|C_i| = O(1)$ and $1 - p(\theta) = O(1 - \theta)$, then



Pr 



M{9) - {X-q{9))n 



> A 



^ 2g-f^(mm{A,^^})_ 



(9.4) 



We prove a concentration result for all of the $M(\theta)$'s that, together with the cut-off lemma,
implies Lemma 7.1.



Lemma 9.3 With the same hypotheses as in the main lemma, 



Pr 



max 



M(9) -{X-q{9))n 



< A 



< 2e 



_n{mm{A,^^A_|) 



Proof. Clearly, 

\M{e) - (A - qie))n\ < \M{ej - (A - q{ej)n\ + \M{e,) - M{e) - {q{e) - qie,))n\. 

Let r(^) = \M{e) - (A - q{e))n\, 

^ = Ti9J, = \M{9J - M{e) - {q{e) - g(^^J)n|, 
and $0 is the event \V{e) - p{e)n\ < Then, (El) gives 

Pr[^> A/2]<2e"''^'"'°^^'l4)^". 
For ^e, suppose {Xg/ := (y{9'), M{9'))}g<0><i is given. Using 

M{e) = Y,dv{e) + do{e)i{v ^ v{9)) = Y,d,{i) - d,{d)i{v e v{e)), 



we obtain 

M{e,)-M{9) = Y,d,{e)i{vev{e))-dM)Kvev{e,)) 



m 



J2 ( E ^ \/(^)) - ( 5^ dM)m ^ v{e,)). 



i=l vGd 
Once $V(\theta)$ is given, the distributions of

Yr.= {Y. dv{0))m c v{e)) - ( J2 dM)m q ^(^j) 

veCi veCi 
depend on neither {M{9')}g<g/<i nor {V{9')}g^o'<i and hence, for Cj C V{9), 



E 



{Xe'} 



e<e'<i 



e[J2 dM - ( d.{9S)l{v e V{9,))\Ci C V{9) 
q{9) - q{9,) 



p{9) 

If $C_i \not\subseteq V(\theta)$, $Y_i = 0$ since $P$ is increasing.
Also, for $C_i \subseteq V(\theta)$, we may write



\a 



and 



Y/ 



CiCV{9)\ < 2E[(^J2dv{d)-dMy\CiCV{9) 



v&Ci 



+2e[I^Y1 dMm % ym^^i q n^)] ■ 



veCi 



For $j = 1, 2, 3$,

^[ <P{0)-'E[(^Y.dv{0)-dM)yj] = 0{9-9,) = 0(1-9,) 



for p{9) > p{9-^ = 0(1) and X]?,eCi '^f (^) ~ c^ijI^J is a Poisson random variable with mean 
(^^ — ^^JA|Ci| = 0{9 — 9-^). For the second term, FKG inequality gives 

E[{^Y.'^Mm%v{9,))y\c.^v{9)\ < ( J]d,(^ji(a2y(^j))'] 

< pi9r'E[ (^J2dv{0.)y]E[liagV{9,))] 



veCi 



veCi 

0{l-p{9,))^0{l-9,), 



for j = 1,2,3. Therefore, 



E 



(Yi-E[Yi\y\{Xe'}e<e'<i\ < W<i] ^0(1-9,). 



Similarly, for ^ in the range |.^| < = 1, it is not hard to show 



E 



{Y, - £;[y,])3e«(^'-^[^d) {Xo>}e<e'<i] = 0{1 - 9,) 
Applying the generalized Chernoff bound, we have



Pr 



[IE 



p{9) 



-\Vie)\ > A/A {Xg,}e<e' 



<i 



< 2e 



-0(min{A, 



Finally, as the event $9 guarantees 

q{e) - q{e,) 



p{0) 

for p{0^) < p{0) and q{6) < A, we have 



\V{9) \ -p{9)n < A/4 



Pr [ I 5^ - {q{e) - q{e,))n > A/2 {Xg,}e<9' 



< 



Pr 



-11^(0)1 >A/A{Xe>}e<e' 



<i 



<i 



Lemma 4.5 yields the desired inequality.






